REGULATION OF GENE EXPRESSION IN PLANTS

This invention relates to methods of modulating
the expression of desired genes in plants, and to DNA
sequences and genetic constructs for use in these methods.
In particular, the invention relates to methods and
constructs for targeting of expression specifically to the
endosperm of the seeds of cereal plants such as wheat, and
for modulating the time of expression in the target tissue.
This is achieved by the use of promoter sequences from
enzymes of the starch biosynthetic pathway. In a preferred
embodiment of the invention, the sequences and/or promoters
are those of starch branching enzyme I, starch branching
enzyme II, soluble starch synthase I, and starch debranching
enzyme, all derived from Triticum tauschii, the D genome
donor of hexaploid bread wheat.

A further preferred embodiment relates to a method 
of identifying variations in the characteristics of plants. 

BACKGROUND OF THE INVENTION

Starch is an important constituent of cereal
grains and of flours, accounting for about 65-67% of the
weight of the grain at maturity. It is produced in the
amyloplast of the grain endosperm by the concerted action of
a number of enzymes, including ADP-glucose pyrophosphorylase
(EC 2.7.7.27), starch synthases (EC 2.4.1.21), branching
enzymes (EC 2.4.1.18) and debranching enzymes (EC 3.2.1.41
and EC 3.2.1.68) (Ball et al, 1996; Martin and Smith, 1995;
Morell et al, 1995). Some of the proteins involved in the
synthesis of starch can be recovered from the starch
granule (Denyer et al, 1995; Rahman et al, 1995).

Most wheat cultivars normally produce starch
containing 25% amylose and 75% amylopectin. Amylose is
composed of large linear chains of α(1-4) linked
α-D-glucopyranosyl residues, whereas amylopectin is a branching
form of α-glycan linked by α(1-6) linkages. The ratio of
amylose and amylopectin, the branch chain length and the
number of branch chains of amylopectin are the major factors
which determine the properties of wheat starch.

Starch with various properties has been widely
used in industry, food science and medical science. High
amylose wheat can be used for plastic substitutes and in
paper manufacture to protect the environment; in health
foods to reduce bowel cancer and heart disease; and in
sports foods to improve athletes' performance. High
amylopectin wheat may be suitable for Japanese noodles, and
is used as a thickener in the food industry.

Wheat contains three sets of chromosomes (A, B and
D) in its very large genome of about 10^10 base pairs (bp).
The donor of the D genome to wheat is Triticum tauschii, and
by using a suitable accession of this species the genes from
the D genome can be studied separately (Lagudah et al,
1991).

There is comparatively little variation in starch
structure found in wheat varieties, because the hexaploid
nature of wheat prevents mutations from being readily
identified. Dramatic alterations in starch structure are
expected to require the combination of homozygous recessive
alleles from each of the 3 wheat genomes, A, B and D. This
requirement renders the probability of finding such mutants
in natural or mutagenised populations of wheat very low.
Variation in wheat starch is desirable in order to enable
better tailoring of wheat starches for processing and
end-user requirements.
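As a rough illustration of why such mutants are rare, the sketch below multiplies an assumed per-genome frequency of plants homozygous for a null allele across the three genomes. The frequency value, the assumption that the genomes behave independently, and the function name are all hypothetical; none of them is taken from the data reported here.

```python
import math


def triple_null_probability(per_genome_freq: float, genomes: int = 3) -> float:
    """Chance that a single plant is homozygous null in every genome.

    Illustrative assumption: each genome (A, B, D) carries the homozygous
    null state independently and at the same frequency.
    """
    return per_genome_freq ** genomes


# Hypothetical frequency of 1 in 1000 per genome (not a measured value).
p = triple_null_probability(1e-3)
print(f"Expected fraction of triple-null plants: {p:.1e}")
```

Under these assumed numbers roughly one plant in a billion would carry all three null states, which is why screening populations directly is impractical.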

Key commercial targets for the manipulation of
starch biosynthesis are:

1. "Waxy" wheats in which amylose content is
decreased to insignificant levels. This outcome is expected
to be obtained by eliminating granule-bound starch synthase
activity.

2. High amylose wheats, expected to be obtained
by suppressing starch branching enzyme-II activity.

3. Wheats which continue to synthesise starch
at elevated temperatures, expected to be obtained by
identifying or introducing a gene encoding a heat-stable
soluble starch synthase.

4. "Sugary types" of wheat which contain
increased amylose content and free sugars, expected to be
obtained by manipulating an isoamylase-type debranching
enzyme.

There are two general strategies which may be used
to obtain wheats with altered starch structure:

(a) using genetic engineering strategies to
suppress the activity of a specific gene, or to introduce a
novel gene into a wheat line; and

(b) selecting among existing variation in wheat for
missing ("null") or altered alleles of a gene in
each of the genomes of wheat, and combining
these by plant breeding.

However, in view of the complexity of the gene families,
particularly starch branching enzyme I (SBE I), without the
ability to target regions which are unique to genes
expressed in endosperm, modification of wheat by combination
of null alleles of several enzymes in general represents an
almost impossible task.

Branching enzymes are involved in the production
of glucose α-1,6 branches. Of the two main constituents of
starch, amylose is essentially linear, but amylopectin is
highly branched; thus branching enzymes are thought to be
directly involved in the synthesis of amylopectin but not
amylose. There are two types of branching enzymes in plants,
starch branching enzyme I (SBE I) and starch branching
enzyme II (SBE II), and both are about 85 kDa in size. At
the nucleic acid level there is about 65% sequence identity
between types I and II in the central portion of the
molecules; the sequence identity between SBE I from
different cereals is about 85% overall (Burton et al, 1995;
Morell et al, 1995).

In cereals, SBE I genes have so far been reported
only for rice (Kawasaki et al, 1991; Rahman et al, 1997). A
cDNA sequence for wheat SBE I is available on the GenBank
database (Accession No. Y12320; Repellin A., Nair R.B., Baga
M., and Chibbar R.N.: Plant Gene Register PGR97-094, 1997).
As far as we are aware, no promoter sequence for wheat SBE I
has been reported.

We have characterised an SBE I gene, designated
wSBE I-D2, from Triticum tauschii, the donor of the D genome
to wheat (Rahman et al, 1997). This gene encoded a protein
sequence which had a deletion of approximately 65 amino
acids at the C-terminal end, and appeared not to contain
some of the conserved amino acid motifs characteristic of
this class of enzyme (Svensson, 1994). Although wSBE I-D2
was expressed as mRNA, no corresponding protein has yet been
found in our analysis of SBE I isoforms from the endosperm,
and thus it is possible that this gene is a transcribed
pseudogene.

Genes for SBE II are less well characterised; no
genomic sequences are available, although SBE II cDNAs from
rice (Mizuno et al, 1993; Accession No. D16201) and maize
(Fisher et al, 1993; Accession No. L08065) have been
reported. In addition, a cDNA sequence for SBE II from
wheat is available on the GenBank database (Nair et al,
1997; Accession No. Y11282); although the sequences are very
similar to those reported herein, there are differences near
the N-terminus of the protein, which specifies its
intracellular location. No promoter sequences have been
reported, as far as we are aware.

Wheat granule-bound starch synthase (GBSS) is
responsible for amylose synthesis, while wheat branching
enzymes together with soluble starch synthases are
considered to be directly involved in amylopectin
biosynthesis. A number of isoforms of soluble and
granule-bound starch synthases have been identified in developing
wheat endosperm (Denyer et al, 1995). There are three
distinct isoforms of starch synthases, 60 kDa, 75-77 kDa and
100-105 kDa, which exist in the starch granules (Denyer et
al, 1995; Rahman et al, 1995). The 60 kDa GBSS is the
product of the wx gene. The 75-77 kDa protein is a wheat
soluble starch synthase I (SSS I) which is present in both
the soluble fraction and the starch granule-bound fraction
of the endosperm. However, the 100-105 kDa proteins, which
are another type of soluble starch synthase, are located
only in starch granules (Denyer et al, 1995; Rahman et al,
1995). To our knowledge there has been no report of any
complete wheat SSS I sequence, either at the protein or the
nucleotide level.

Both cDNA and genomic DNA encoding a soluble
starch synthase I of rice have been cloned and analysed
(Baba et al, 1993; Tanaka et al, 1995). The cDNAs encoding
potato soluble starch synthase SSS II and SSS III and pea
soluble starch synthase SSS II have also been reported
(Edwards et al, 1995; Marshall et al, 1996; Dry et al,
1992). However, corresponding full length cDNA sequences for
wheat have hitherto not been available, although a partial
cDNA sequence (Accession No. U48227) has been released to
the GenBank database.

Approach (b) referred to above has been
demonstrated for the gene for granule-bound starch synthase.
Null alleles on chromosomes 7A, 7D and 4A were identified by
the analysis of GBSS protein bands by electrophoresis, and
combined by plant breeding to produce a wheat line
containing no GBSS, and no amylose (Nakamura et al, 1995).
Subsequently, PCR-based DNA markers have been identified,
which also identify null alleles for the GBSS loci on each
of the three wheat genomes.

Despite the availability of a
considerable amount of information in the prior art, major
problems remain. Firstly, the presence of three separate
sets of chromosomes in wheat makes genetic analysis in this
species extraordinarily complex. This is further
complicated by the fact that a number of enzymes are
involved in starch synthesis, and each of these enzymes is
itself present in a number of forms, and in a number of
locations within the plant cell. Little, if any,
information has been available as to which specific form of
each enzyme is expressed in endosperm. For wheat, a limited
amount of nucleic acid sequence information is available,
but this is only cDNA sequence; no genomic sequence, and
consequently no information regarding promoters and other
control sequences, is available. Without being able to
demonstrate that the endosperm-specific gene within a family
has been isolated, such sequence information is of limited
practical usefulness.

SUMMARY OF THE INVENTION 

In this application we report the isolation and
identification of novel genes from T. tauschii, the D-genome
donor of wheat, that encode SBE I, SBE II, a 75 kDa SSS I,
and an isoamylase-type debranching enzyme (DBE). Because of
the very close relationship between T. tauschii and wheat,
as discussed above, results obtained with T. tauschii can be
directly applied to wheat with little if any modification.
Such modification as may be required represents routine
trial and error experimentation. Sequences from these genes
can be used as probes to identify null or altered alleles in
wheat, which can then be used in plant breeding programmes
to provide modifications of starch characteristics. The
novel sequences of the invention can be used in genetic
engineering strategies to introduce a desired gene into a
host plant, or to provide antisense sequences for suppression
of one or more specific genes in a host plant, in order to
modify the characteristics of starch produced by the plant.

By using T. tauschii, we have been able to examine
a single genome, rather than three as in wheat, and to
identify and isolate the forms of the starch synthesis genes
which are expressed in endosperm. By addressing genomic
sequences we have been able to isolate tissue-specific
promoters for the relevant genes, which provides a mechanism
for simultaneous manipulation of a number of genes in the
endosperm. Because T. tauschii is so closely related to
wheat, results obtained with this model system are directly
applicable to wheat, and we have confirmed this
experimentally. The genomic sequences which we have
determined can also be used as probes for the identification
and isolation of corresponding sequences, including promoter
sequences, from other cereal plant species.

In its most general aspect, the invention provides
a nucleic acid sequence encoding an enzyme of the starch
biosynthetic pathway in a cereal plant, said enzyme being
selected from the group consisting of starch branching
enzyme I, starch branching enzyme II, soluble starch
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize.

Preferably the nucleic acid sequence is a DNA
sequence, and may be genomic DNA or cDNA. Preferably the
sequence is one which is functional in wheat. More
preferably the sequence is derived from a Triticum species,
most preferably Triticum tauschii.

Where the sequence encodes soluble starch
synthase, preferably the sequence encodes the 75 kDa soluble
starch synthase of wheat.

Biologically-active untranslated control sequences
of genomic DNA are also within the scope of the invention.
Thus the invention also provides the promoter of an enzyme
as defined above.

In a preferred embodiment of this aspect of the
invention, there is provided a nucleic acid construct
comprising a nucleic acid sequence of the invention, a
biologically-active fragment thereof, or a fragment thereof
encoding a biologically-active fragment of an enzyme as
defined above, operably linked to one or more nucleic acid
sequences facilitating expression of said enzyme in a plant,
preferably a cereal plant. The construct may be a plasmid
or a vector, preferably one suitable for use in the
transformation of a plant. A particularly suitable vector
is a bacterium of the genus Agrobacterium, preferably
Agrobacterium tumefaciens. Methods of transforming cereal
plants using Agrobacterium tumefaciens are known; see for
example Australian Patent No. 667939 by Japan Tobacco Inc.,
International Patent Application Number PCT/US97/10621 by
Monsanto Company and Tingay et al (1997).

In a second aspect, the invention provides a
nucleic acid construct for targeting of a desired gene to
endosperm of a cereal plant, and/or for modulating the time
of expression of a desired gene in endosperm of a cereal
plant, comprising one or more promoter sequences selected
from SBE I promoter, SBE II promoter, SSS I promoter, and
DBE promoter, operatively linked to a nucleic acid sequence
encoding a desired protein, and optionally also operatively
linked to one or more additional targeting sequences and/or
one or more 3' untranslated sequences.

The nucleic acid encoding the desired protein may
be in either the sense orientation or in the antisense
orientation. Preferably the desired protein is an enzyme of
the starch biosynthetic pathway. For example, the antisense
sequences of GBSS, starch debranching enzyme, SBE II, low
molecular weight glutenin, or grain softness protein I, may
be used. Preferred sequences for use in sense orientation
include those of bacterial isoamylase, bacterial glycogen
synthase, or wheat high molecular weight glutenin Bx17. It
is contemplated that any desired protein which is encoded by
a gene which is capable of being expressed in the endosperm
of a cereal plant is suitable for use in the invention.

In a third aspect, the invention provides a method
of modifying the characteristics of starch produced by a
plant, comprising the step of:

(a) introducing a gene encoding a desired enzyme
of the starch biosynthetic pathway into a host plant, and/or

(b) introducing an anti-sense nucleic acid
sequence directed to a gene encoding an enzyme of the starch
biosynthetic pathway into a host plant,

wherein said enzymes are as defined above.

Where both steps (a) and (b) are used, the enzymes
in the two steps are different.

Preferably the plant is a cereal plant, more
preferably wheat or barley.

As is well known in the art, anti-sense sequences
can be used to suppress expression of the protein to which
the anti-sense sequence is complementary. It will be
evident to the person skilled in the art that different
combinations of sense and anti-sense sequences may be chosen
so as to effect a variety of different modifications of the
characteristics of the starch produced by the plant.
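
By way of illustration only, the relationship between a sense strand and its anti-sense counterpart can be sketched as a reverse complement. The function name and the six-base example sequence are hypothetical and do not correspond to any sequence of the invention.

```python
# Map each DNA base to its Watson-Crick partner (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense: str) -> str:
    """Return the reverse complement of a DNA sense strand, read 5' to 3'."""
    return sense.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base fragment beginning with an ATG start codon.
print(antisense("ATGGCC"))
```

Applying the function twice returns the original sense strand, which is a convenient check that a construct has been oriented as intended.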

In a fourth aspect, the invention provides a
method of targeting expression of a desired gene to the
endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to the
invention.

According to a fifth aspect, the invention
provides a method of modulating the time of expression of a
desired gene in endosperm of a cereal plant, comprising the
step of transforming the plant with a construct according to
the second aspect of the invention.

Where expression at an early stage following
anthesis is desired, the construct preferably comprises the
SBE II, SSS I or DBE promoters. Where expression at a later
stage following anthesis is desired, the construct
preferably comprises the SBE I promoter.

While the invention is described in detail in
relation to wheat, it will be clearly understood that it is
also applicable to other cereal plants of the family
Gramineae, such as maize, barley and rice.

Methods for transformation of monocotyledonous
plants such as wheat, maize, barley and rice and for
regeneration of plants from protoplasts or immature plant
embryos are well known in the art. See for example Lazzeri
et al, 1991; Jahne et al, 1991 and Wan and Lemaux, 1994 for
barley; Wirtzens et al, 1997; Tingay et al, 1997; Canadian
Patent Application No. 2092588 by Nehra; Australian Patent
Application No. 61781/94 by National Research Council of
Canada, Australian Patent No. 667939 by Japan Tobacco Co,
and International Patent Application Number PCT/US97/10621
by Monsanto Company.

The sequences of ADP glucose pyrophosphorylase
from barley (Australian Patent Application No. 65392/94),
starch debranching enzyme and its promoter from rice
(Japanese Patent Publication No. Kokai 6261787 and Japanese
Patent Publication No. Kokai 5317057), and starch
debranching enzyme from spinach and potato (Australian
Patent Application No. 44333/96) are all known.

Detailed Description of the Drawings

The invention will be described in detail by
reference only to the following non-limiting examples and to
the figures.

Figure 1 shows the hybridisation of genomic clones
isolated from T. tauschii.

DNA was extracted from the different clones,
digested with BamHI and hybridised with the 5' end of the
maize SBE I cDNA. Lanes 1, 2, 3 and 4 correspond to DNA
from clones λE1, λE2, λE6 and λE7 respectively. Note that
clones λE1 and λE2 give identical patterns, the SBE I gene
in λE6 is a truncated form of that in λE1, and λE7 gives a
clearly different pattern.

Figure 2 shows the hybridisation of DNA from
T. tauschii.

DNA from T. tauschii was digested with BamHI and
the hybridisation pattern compared with DNA from λE1 and λE7
digested with the same enzyme. Fragment E1.1 (see Figure 3)
from λE1 was used as the probe; it contains some sequences
that are over 80% identical to sequences in E7.8.
Approximately 25 µg of T. tauschii DNA was electrophoresed
in lane 1, and 200 pg each of λE1 and λE7 in lanes 2 and 3,
respectively.

Figure 3 shows the restriction maps of clone λE1
and λE7. The fragments obtained with EcoRI and BamHI are
indicated. The fragments sequenced from λE1 are E1.1, E1.2,
a part of E1.7 and a part of E1.5.

Figure 4 shows the comparison of deduced amino
acid sequence of wSBE I-D4 cDNA with the deduced amino acid
sequence of rice SBE I (RSBE I; Nakamura et al, 1992), maize
SBE I (MSBE I; Baba et al, 1991), wSBE I-D2 type cDNA (D2
cDNA; Rahman et al, 1997), pea SBE II (PESBE II, homologous
to maize SBE I; Burton et al, 1995), and potato SBE I
(POSBE; Cangiano et al, 1993). The deduced amino acid
sequence of the wSBE I-D4 cDNA is denoted by "D4cDNA".
Residues present in at least three of the sequences are
identified in the consensus sequence in capitals.

Figure 5 shows the intron-exon structure of
wSBE I-D4 compared to the corresponding structures of rice
SBE I (Kawasaki et al, 1993) and wSBE I-D2 (Rahman et al,
1997). The intron-exon structure of wSBE I-D4 is deduced by
comparison with the SBE I cDNA reported by Repellin et al
(1997).

The dark rectangles correspond to exons and the
light rectangles correspond to introns. The bars above the
structures indicate the percentage identity in sequence
between the indicated exons and introns of the relevant
genes. Note that intron 2 shares no significant sequence
identity and is not indicated.

Figure 6 shows the nucleotide sequence of part of
wSBE I-D4, the amino acid sequence deduced from this
nucleotide sequence, and the N-terminal amino acid sequence
of the SBE I purified from the wheat endosperm (Morell et
al, 1997).

Figure 7 shows the hybridisation of SBE I genomic
clones with the following probes:

A. wSBE I-D45 (derived from the 5' end of the
gene and including sequence from fragments E1.1 and E1.7),

and

B. wSBE I-D43 (derived from the 3' end of the
gene and containing sequences from fragment E1.5).

For panel A, the tracks 1-13 correspond to clones λE1, λE2, λE6,
λE7, λE9, λE14, λE22, λE21, molecular weight markers, λE29,
λE30, λE31 and λE52. For panel B, tracks 1-12 correspond to
clones λE1, λE2, λE6, λE7, λE9, λE14, λE22, λE21, λE29,
λE30, λE31 and λE52. Note that clones λE7 and λE22 do not
hybridise to either of the probes and are wSBE I-D2 type
genes. Also note that clone λE30 contains a sequence
unrelated to SBE I. The size of the molecular weight
markers in kb is indicated. Clones λE7 and λE22 do
hybridise with a probe from E1.1, which is highly conserved
between wSBE I-D2 and wSBE I-D4.

Figure 8 shows the alignment of cDNA clones to
obtain the sequence represented by wSBE I-D4 cDNA. BED4 and
BED5 were obtained from screening the cDNA library with
maize BEI (Baba et al, 1991). BED1, 2 and 3 were obtained
by RT-PCR using defined primers.

Figure 9a shows the expression of Soluble Starch
Synthase I (SSS), Starch Branching Enzyme I (BE I) and
Starch Branching Enzyme II (BE II) mRNAs during endosperm
development.

RNA was purified from leaves, florets prior to
anthesis, and endosperm of wheat cultivar Rosella grown in a
glasshouse, collected 5 to 8 days after anthesis, 10 to 15
days after anthesis and 18 to 22 days after anthesis, and
from the endosperm of wheat cultivar Rosella grown in the
field and collected 12, 15 and 18 days after anthesis
respectively. Equivalent amounts of RNA were
electrophoresed in each lane. The probes were from the
coding region of the SM2 SSS I cDNA (from nucleotide 1615 to
1919 of the SM2 cDNA sequence); wSBE I-D43C (see Table 1),
which corresponds to the untranslated 3' end of wSBE I-D4
cDNA (E1 (3')); and the 5' region of SBE9 (SBE9 (5')),
corresponding to the region between nucleotides 743 to 1004
of GenBank sequence Y11282. No hybridisation to RNA
extracted from leaves or preanthesis florets was detected.

Figure 9b shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the starch branching enzyme I gene. The probe, wSBE I-D43, is
defined in Table 1.

Figure 9c shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Wyuna" with
the starch branching enzyme II gene. The probe, wSBE II-D13,
is defined in Table 2.

Figure 9d shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the SSS I gene. The probe spanned the region from
nucleotides 2025 to 2497 of the SM2 cDNA sequence shown in
SEQ ID No: 11.

Figure 9e shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the DBE I gene. The probe, a DBE3' 3' PCR fragment, extends
from nucleotide position 281 to 1072 of the cDNA sequence in
SEQ ID No: 16.

Figure 9f shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the wheat actin gene. The probe was a wheat actin DNA
sequence generated by PCR from wheat endosperm cDNA using
primers to conserved plant actin sequences.

Figure 9g shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
a probe containing wheat ribosomal RNA 26S and 18S fragments
(plasmid pta250.2 from Dr Bryan Clarke, CSIRO Plant
Industry).

Figure 9h shows the hybridisation of RNA from the
hexaploid wheat cultivar "Gabo" with the DBE I probe
described in Figure 9e. Lane 1, leaf RNA; lane 2,
pre-anthesis floret RNA; lane 3, RNA from endosperm harvested 12
days after anthesis.

Figure 10 shows the comparison of wSBE I-D4
(sr427.res ck: 6,362, 1 to 11,099) and rice SBE I genomic
sequence (d10838.em_pl ck: 3,071, 1 to 11,700) (Kawasaki et
al, 1993; Accession Number D10838) using the programs
Compare and DotPlot (Devereux et al, 1984). The programs
used a window of 21 bases with a stringency of 14 to
register a dot.
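
The window and stringency criterion can be sketched as follows. This is a minimal reimplementation of the general dot-matrix idea, not the code of the programs cited above; the function name is hypothetical, and the sketch records the start position of each matching window rather than reproducing any particular plotting convention.

```python
def dot_matrix(seq_a: str, seq_b: str, window: int = 21, stringency: int = 14):
    """Return (i, j) start positions where a dot would be registered.

    A dot is registered when at least `stringency` of the `window`
    aligned bases starting at seq_a[i] and seq_b[j] are identical.
    """
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if matches >= stringency:
                dots.append((i, j))
    return dots


# Toy example with a short window so the result is easy to follow by hand.
print(dot_matrix("AAAA", "AAAT", window=3, stringency=3))
```

The nested loops compare every window pair and so scale with the product of the two sequence lengths; for sequences of around 11 kb, as compared here, a vectorised or word-indexed approach would be preferable in practice.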

Figure 11 shows the hybridisation of wheat DNA
from chromosome-engineered lines using the following probes:

A. wSBE I-D45 (from the 5' end of the gene),

B. wSBE I-D43 (from the 3' end of the gene),

and

C. wSBE I-D4R (repetitive sequence
approximately 600 bp 3' to the end of wSBE I-D4 sequence).

N7AT7B, no 7A chromosome, four copies of 7B
chromosome; N7BT7D, no 7B chromosome, four copies of 7D
chromosome; N7DT7A, no 7D chromosome, four copies of 7A
chromosome. The chromosomal origin of hybridising bands is
indicated.
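
The line names used in this and later figures encode which chromosome is absent (N) and which is present in four copies (T), in either order (for example N7AT7B or T2AN2D). A minimal sketch of how such names can be decoded, and how a band missing from exactly one line points to its chromosome of origin, is given below; the function names and the dictionary layout are hypothetical.

```python
import re

# Both orderings occur in the figure legends: N7AT7B and T2AN2D.
_PATTERNS = (
    re.compile(r"^N(?P<nulli>\d[ABD])T(?P<tetra>\d[ABD])$"),
    re.compile(r"^T(?P<tetra>\d[ABD])N(?P<nulli>\d[ABD])$"),
)


def parse_line(name: str) -> dict:
    """Decode a nullisomic-tetrasomic line name into its two chromosomes."""
    for pattern in _PATTERNS:
        match = pattern.match(name)
        if match:
            return {
                "absent": match.group("nulli"),
                "four_copies": match.group("tetra"),
            }
    raise ValueError(f"Unrecognised line name: {name}")


def band_origin(band_present: dict):
    """Return the chromosome whose absence removes the band, if unambiguous.

    `band_present` maps each line name to True or False for one band.
    """
    missing = [
        parse_line(name)["absent"]
        for name, present in band_present.items()
        if not present
    ]
    return missing[0] if len(missing) == 1 else None


print(parse_line("N7AT7B"))
```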

Figure 12 shows the hybridisation of genomic
clones F1, F2, F3 and F4 with the entire SBE-9 sequence.
The DNA from the clones was purified and digested with
either BamHI or EcoRI, separated on agarose, blotted onto
nitrocellulose and hybridised with labelled SBE-9 (a SBE II
type cDNA). The pattern of hybridising bands is different
in the four isolates.

Figure 13a shows the N-terminal sequence of
purified SBE II from wheat endosperm as in Morell et al
(1997).

Figure 13b shows the deduced amino acid sequence
from part of wSBE II-D1 that encodes the N-terminal sequence
as described in Morell et al (1997).

Figure 14 shows the deduced exon-intron structure
for a part of wSBE II-D1. The scale is marked in bases.
The dark rectangles are exons.

Figure 15 shows the hybridisation of DNA from
chromosome engineered lines of wheat (cultivar Chinese
Spring) with a probe from nucleotides 550-850 from SBE-9.
The band of approximately 2.2 kb is missing in the line in
which chromosome 2D is absent.

T2BN2A: four copies of chromosome 2B, no copies
of chromosome 2A;

T2AN2B: four copies of chromosome 2A, no copies
of chromosome 2B;

T2AN2D: four copies of chromosome 2A, no copies
of chromosome 2D.

Figure 16 shows the N-terminal sequence of SSS I
protein isolated from starch granules (Rahman et al, 1995)
and deduced amino acid sequence of part of Sm2.

Figure 17 shows the hybridisation of genomic
clones sg1, 3, 4, 6 and 11 with the cDNA clone (sm2) for SSS
I. DNA was purified from indicated genomic clones, digested
with BamHI or SacI and hybridised to sm2. Note that the
hybridisation patterns for sg1, 3 and 4 are clearly
different from each other.

Figure 18 shows a comparison of the intron/exon
structures of the wheat and rice soluble starch synthase
genomic sequences. The dark rectangles indicate exons and
the light rectangles represent introns.

Figure 19 shows the hybridisation of DNA from
chromosome engineered lines of wheat (cultivar Chinese
Spring) digested with PvuII, with the sm2 probe.

N7AT7B: no 7A chromosome, four copies of 7B
chromosome;

N7BT7D: no 7B chromosome, four copies of 7D
chromosome;

N7DT7A: no 7D chromosome, four copies of 7A
chromosome.

A band is missing in the N7BT7A line.

Figure 20a shows the DNA sequence of a portion of
the wheat debranching enzyme (WDBE-I) PCR product. The
PCR product was generated from wheat genomic DNA (cultivar
Rosella) using primers based on sequences conserved in
debranching enzymes from maize and rice.

Figure 20b shows a comparison of the nucleotide
sequence of wheat debranching enzyme I (WDBE-I) PCR fragment
(WHEAT.DNA) with the maize Sugary-1 sequence (SUGARY.DNA).

Figure 20c shows a comparison between the 
intron/exon structures of wheat debranching enzyme gene and 
the maize sugary-1 debranching enzyme gene. 

Figure 21a shows the results of Southern blotting
of T. tauschii DNA with wheat DBE-I PCR product. DNA from
T. tauschii was digested with BamHI, electrophoresed,
blotted and hybridised to the wheat DBE-I PCR product
described in Figure 20a. A band of approximately 2 kb
hybridised.

Figure 21b shows Chinese Spring nullisomic/
tetrasomic lines probed with probes from the DBE gene. Panel
(I) shows hybridisation with a fragment spanning the region
from nucleotide 270 to 465 of the cDNA sequence shown in SEQ
ID No: 16 from the central region of the DBE gene. Panel
(II) shows hybridisation with a probe from the 3' region of
the gene, from nucleotide 281 to 1072 of the cDNA sequence
given in SEQ ID No: 16.

Figures 22a to 22e show diagrammatic
representations of the DNA vectors used for transient
expression analysis. In each of the sequences the N-terminal
methionine encoding ATG codon is shown in bold.

Figure 22a shows a DNA construct pwsssIpro1gfpNOT
containing a 1042 base pair region of the wheat soluble
starch synthase I promoter (wSSSIpro1, from -1042 to -1, SEQ
ID No: 18) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22b shows a DNA construct pwsssIpro2gfpNOT
containing a 3914 base pair region of the wheat soluble
starch synthase I promoter (wSSSIpro2, from -3914 to -1, SEQ
ID No: 18) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22c shows a DNA construct psbeIIpro1gfpNOT
containing a 1203 base pair region of the wheat starch
branching enzyme II promoter (sbeIIpro1, from 1 to 1023 of SEQ
ID No: 10) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22d shows a DNA construct psbeIIpro2gfpNOT
containing a 1353 base pair region of the wheat starch
branching enzyme II promoter and transit peptide coding
region (sbeIIpro2, regions 1-1203, 1204 to 1336 and 1664 to
1680 of SEQ ID No: 10) fused to the green fluorescent protein
(GFP) reporter gene.

Figure 22e shows a DNA construct pact_jsgfp_nos
containing the plasmid backbone of pSP72 (Promega), the rice
Act1 actin promoter (McElroy et al, 1991), the GFP gene
(Sheen et al, 1995) and the Agrobacterium tumefaciens
nopaline synthase (nos) terminator (Bevan et al, 1983).

Figure 23 shows T-DNA constructs for stable
transformation of rice by Agrobacterium. The backbone for
each plasmid is p35SH-iC (Wang et al 1997). The various
promoter-GFP-Nos regions inserted are shown in (a), (b), (c)
and (d) respectively, and are described in detail in Example
24. Each of these constructs was inserted into the NotI
site of p35SH-iC using the NotI flanking sites at each end
of the promoter-GFP-Nos regions. The constructs were named
(a) p35SH-iC-BEIIpro1_GFP_Nos, (b) p35SH-iC-BEIIpro2_GFP_Nos,
(c) p35SH-iC-SSIpro1_GFP_Nos and (d) p35SH-iC-SSIpro2_GFP_Nos.

Figure 24 illustrates the design of 15 intron-spanning
BE II primer sets. Primers were based on the
wSBE II-D1 sequence (SEQ ID No: 10), and were designed such
that intron sequences in the wSBE II-D1 sequence (deduced
from Figure 13b and Nair et al, 1997; Accession No. Y11282)
were amplified by PCR.

Figure 25 shows the results of amplification using
the SBE II-Intron 5 primer set (primer set 6: sr913F and
WBE2E6R) on various diploid, tetraploid and hexaploid
wheats.

i) T. boeoticum (A genome diploid)

ii) T. tauschii (D genome diploid)

iii) T. aestivum cv. Chinese Spring ditelosomic line
2AS (lacking chromosome arm 2AL)

iv) Crete 10 (AABB tetraploid)

v) T. aestivum cv. Rosella (hexaploid)

The horizontal axis indicates the size of the
product in base pairs; the vertical axis shows arbitrary
fluorescence units. The various arrows indicate the products
of different genomes: A, A genome; B, B genome; D, D genome;
U, unassigned additional product.

Figure 26 shows the results obtained by
amplification using the SBE II-Intron 10 primer set (primer
set 11: daS.seq and WBE2E11R) on the wheat lines:

(i) T. aestivum cv. Chinese Spring ditelosomic line
2AS.

(ii) T. aestivum Chinese Spring
nullisomic/tetrasomic line N2BT2A.

(iii) T. aestivum Chinese Spring
nullisomic/tetrasomic line N2DT2B.

The horizontal axis indicates the size of the
product in base pairs; the vertical axis shows arbitrary
fluorescence units. The various arrows indicate the products
of different genomes: A, A genome; B, B genome; D, D genome.

Figure 27 shows the results of transient
expression assays. The photographs (40 x magnification) show
representative tissue, typical of each promoter and target
tissue, viewed under a Leica microscope with blue light
illumination. Photographs were taken 48 to 72 hours after
tissue bombardment. The promoter constructs are listed as
follows (with the panels showing endosperm, embryo and leaf
expression listed in respective order): pact_jsgfp_nos
(panels a, g and m); pwsssIpro1gfpNOT (panels b, h and n);
pwsssIpro2gfpNOT (panels c, i and o); psbeIIpro1gfpNOT
(panels d, j and p); psbeIIpro2gfpNOT (panels e, k and q);
pZLgfpNOT (panels f, l and r).

Example 1 Identification of Gene Encoding SBE I 

Construction of Genomic Library and Isolation of Clones

The genomic library used in this study was
constructed from Triticum tauschii, var. strangulata,
accession number CPI 100799. Of all the accessions of
T. tauschii surveyed, the genome of CPI 100799 is the most
closely related to the D genome of hexaploid wheat.

Triticum tauschii, var. strangulata (CPI accession
number 110799) was kindly provided by Dr E Lagudah. Leaves
were isolated from plants grown in the glasshouse.

DNA was extracted from leaves of Triticum tauschii
using published methods (Lagudah et al, 1991), partially
digested with Sau3A, size fractionated and ligated to the
arms of lambda GEM 12 (Promega). The ligated products were
used to transfect the methylation-tolerant strain PMC 103
(Doherty et al. 1992). A total of 2 x 10^6 primary plaques
were obtained with an average insert size of about 15 kb.
Thus the library contains approximately 6 genomes' worth of
T. tauschii DNA. The library was amplified and stored at
4°C until required.
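
By way of illustration only, the coverage estimate can be
reproduced with a short calculation. The plaque number and
insert size are taken from the text; the haploid genome size
is not stated in this specification and is an assumed value,
chosen here because it reproduces the figure of approximately
6 genomes.

```python
def genome_equivalents(n_clones: float, insert_bp: float, genome_bp: float) -> float:
    """Number of haploid genome equivalents represented by a library."""
    return n_clones * insert_bp / genome_bp


# Values from the text: 2 x 10^6 primary plaques, ~15 kb average insert.
N_PLAQUES = 2e6
INSERT_BP = 15_000

# ASSUMPTION (not stated in the text): haploid genome of ~5 x 10^9 bp.
ASSUMED_GENOME_BP = 5e9

coverage = genome_equivalents(N_PLAQUES, INSERT_BP, ASSUMED_GENOME_BP)
print(f"{coverage:.1f} genome equivalents")
```

A different assumed genome size scales the result
proportionally, so the sketch is only as reliable as that
assumption.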

Positive plaques in the genomic library were
selected as those hybridising with the 5' end of a maize
starch branching enzyme I cDNA (Baba et al, 1991) using
moderately stringent conditions as described in Rahman et
al (1997).

Preparation of Total RNA from Wheat

Total RNA was isolated from leaves, pre-anthesis
pericarp and different developmental stages of wheat
endosperm of the cultivars Hartog and Rosella. This
material was collected from both the glasshouse and the
field. The method used for RNA isolation was essentially
the same as that described by Higgins et al (1976). RNA was
then quantified by UV absorption and by separation in
1.4% agarose-formaldehyde gels which were then visualized
under UV light after staining with ethidium bromide
(Sambrook et al, 1989).

DNA and RNA analysis 

DNA was isolated and analysed using established
protocols (Sambrook et al, 1989). DNA was extracted from
wheat (cv. Chinese Spring) using published methods (Lagudah
et al, 1991). Southern analysis was performed essentially
as described by Jolly et al (1996). Briefly, 20 µg wheat
DNA was digested, electrophoresed and transferred to a nylon
membrane. Hybridisation was conducted at 42°C in 25% or
50% formamide, 2 x SSC, 6% Dextran Sulphate for 16 h and the
membrane was washed at 60°C in 2 x SSC for 3 x 1 h unless
otherwise indicated. Hybridisation was detected by
autoradiography using Fuji X-Omat film.

RNA analysis was performed as follows. 10 µg of
total RNA was separated in a 1.4% agarose-formaldehyde gel
and transferred to a nylon Hybond N+ membrane (Sambrook et
al, 1989), and hybridized with cDNA probe at 42°C in
Khandjian hybridizing buffer (Khandjian, 1989). The 3' part
of wheat SBE I cDNA (designated wSBE I-D43, see Table 1) was
labelled with the Rapid Multiprime DNA Probe Labelling Kit
(Amersham) and used as probe. After washing at 60°C with
2 x SSC, 0.1% SDS three times, each time for about 1 to
2 hours, the membrane was visualized by overnight exposure
at -80°C with X-ray film, Kodak MR.

Example 2 Frequency of Recovery of SBE I Type Clones
from the Genomic Library

An estimated 2 x 10^6 plaques from the amplified
library were screened using an EcoRI fragment that contained
1200 bp at the 5' end of maize SBE I (Baba et al, 1991) and
twelve independent isolates were recovered and purified.
This corresponds to the screening of somewhat fewer than the
2 x 10^6 primary plaques that exist in the original library
(each of which has an average insert size of 15 kb)
(Maniatis et al, 1982), because the amplification may lead
to the representation of some sequences more than others.
Assuming that the amplified library contains approximately
three genomes of T. tauschii, the frequency with which
SBE I-positive clones were recovered suggests the existence
of about 5 copies of SBE I type genes within the T. tauschii
genome.
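
The specification does not set out the arithmetic behind the
copy number estimate. As a rough sketch only, and assuming
that each independent isolate represents one gene copy in one
genome equivalent, a naive ratio of isolates to genome
equivalents screened gives a value of the same order as the
figure of about 5 given above.

```python
def estimated_copy_number(positive_clones: int, genome_equivalents_screened: float) -> float:
    """Naive estimate: independent positive clones per genome equivalent screened."""
    return positive_clones / genome_equivalents_screened


# Values from the text: twelve independent isolates, approximately
# three genome equivalents in the amplified library.
copies = estimated_copy_number(12, 3.0)
print(f"about {copies:.0f} copies per genome")
```

The estimate is sensitive to the assumed number of genome
equivalents and ignores any bias introduced by amplification
of the library.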

Digestion of DNA from the twelve independent
isolates by the restriction endonuclease BamHI followed by
hybridisation with a maize SBE I clone, suggested that the
genomic clones could be separated into two broad classes
(Figure 1). One class had 10 members and a representative
from this class is the clone λE1 (Figure 1, lane 1); λE6
(Figure 1, lane 3) is a member of this class, but is missing
the 5' end of the E1-SBE I gene because the SBE I gene is at
the extremity of the cloned DNA. Further hybridisation
studies at high stringency with the extreme 5' and
3' regions of the SBE I gene contained in λE1 suggested that
the other clones contained either identical or very closely
related genes.

The second family had two members, and of these
clone λE7 (Figure 1, lane 4) was arbitrarily selected for
further study. These two members did not hybridise to
probes from the extreme 5' and 3' regions of the SBE I gene
that were contained in λE1, indicating that they were a
distinct sub-class.

The DNA from T. tauschii and the lambda clones λE1
and λE7 was digested with BamHI and hybridised with
fragment E1.1, as shown in Figure 2. This fragment contains
sequences that are highly conserved (85% sequence identity
over 0.3 kb between λE1 and λE7), corresponding to exons 3,
4 and 5 of the rice gene. The bands in the genomic DNA at
0.8 kb and 1.0 kb correspond to identical sized fragments
from λE1 and λE7, as shown in Figure 2; these are
fragments E1.1 and E7.8 of the λE1 and λE7 genomic clones
respectively. Thus the arrangement of genes in the genomic
clones is unlikely to be an artefact of the cloning
procedure. There are also bands in the genomic DNA of
approximately 2.5 kb, 4.8 kb and 8 kb in size which are not
found from the digestion of λE1 or λE7; these could
represent genes such as the 5' sequences of wSBE I-D1 or
wSBE I-D3; see below.

Example 3 Tandem Arrangement of SBE I Type Genes in
the T. tauschii Genome

Basic restriction endonuclease maps for λE1 and
λE7 are shown in Figure 3. The maps were constructed by
performing a series of hybridisations of EcoRI or BamHI
digested DNA from λE1 or λE7. The probes used were the
fragments generated from BamHI digestion of the relevant
clone. Confirmation of the maps was obtained by PCR
analysis, using primers both within the insert and also from
the arms of lambda itself. PCR was performed in 10 µl
volume using reagents supplied by Perkin-Elmer. The primers
were used at a concentration of 20 µM. The program used was
94°C, 2 min, 1 cycle, then 94°C, 30 sec; 55°C, 30 sec; 72°C,
1 min for 36 cycles and then 72°C, 5 min; 25°C, 1 min.
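
For convenience, the cycling program described above may be
written out as data, as in the following sketch. The class
and variable names are illustrative only and do not
correspond to any thermocycler software; the calculation
sums programmed hold times and ignores ramping between
temperatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time


# Program as given in the text.
INITIAL_DENATURATION = Step(94, 120)
CYCLE = (Step(94, 30), Step(55, 30), Step(72, 60))
N_CYCLES = 36
FINAL_STEPS = (Step(72, 300), Step(25, 60))


def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in CYCLE)
    final = sum(step.seconds for step in FINAL_STEPS)
    return INITIAL_DENATURATION.seconds + N_CYCLES * per_cycle + final


print(f"{total_hold_seconds() / 60:.0f} min of programmed hold time")
```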

Sequencing was performed on an ABI sequencer using
the manufacturer's recommended protocols for both dye primer
and dye terminator technologies. Deletions were carried out
using the Erase-a-base kit from Promega.

Sequence analysis was carried out using the GCG
version 7 package of computer programs (Devereaux et al,
1984).

The PCR products were also used as hybridisation
probes. The positioning of the genes was derived from
sequencing the ends of the BamHI subclones and also from
sequencing PCR products generated from primers based on the
insert and the lambda arms. The results indicate that there
is only a single copy of a SBE I type gene within λE1.
However, it is clear that λE7 resulted from the cloning of a
DNA fragment from within a tandem array of the SBE I type
genes. Of the three genes in the clone, which are named
wSBE I-D1, wSBE I-D2 and wSBE I-D3, only the central one
(wSBE I-D2) is complete.

Example 4 Construction and Screening of cDNA Library

A wheat cDNA library was constructed from the
cultivar Rosella using pooled RNA from endosperm at 8, 12,
18 and 20 days after anthesis.

The cDNA library was prepared from poly A+ RNA
that was extracted from developing wheat grains (cv.
Rosella, a hexaploid soft wheat cultivar) at 8, 12, 15, 18,
21 and 30 days after anthesis. The RNA was pooled and used
to synthesise cDNA that was propagated in lambda ZapII
(Stratagene).

The library was screened with a genomic fragment
from λE7 encompassing exons 3, 4 and 5 (fragment E7.8 in
Figure 3). A number of clones were isolated. Of these an
apparently full-length clone appeared to encode an unusual
type of cDNA for SBE I. This cDNA has been termed SBE I-D2
type cDNA. The putative protein product is compared with
the maize SBE I and rice SBE I type deduced amino acid
sequences in Figure 4. The main difference is that this
putative protein product is shorter at the C-terminal end,
with an estimated molecular size of approximately 74 kDa
compared with 85 kDa for rice SBE I (Kawasaki et al, 1993).
Note that amino acids corresponding to exon 9 of rice are
missing in SBE I-D2 type cDNA, but those corresponding to
exon 10 are present. There are no amino acid residues
corresponding to exons 11-14 of rice; furthermore, the
sequence corresponding to the last 57 amino acids of
SBE I-D2 type has no significant homology to the sequence of
the rice gene.

We expressed SBE I-D2 type cDNA in E. coli in
order to examine its function. The cDNA was expressed as a
fusion protein with 22 N-terminal residues of β-galactosidase
and two threonine residues followed by the SBE I-D2
cDNA sequence either in or out of frame. Although an
expected product of about 75 kDa in size was produced from
only the in-frame fusion, we could not detect any enzyme
activity from crude extracts of E. coli protein.
Furthermore the in-frame construct could not complement an
E. coli strain with a defined deletion in glycogen
branching, although other putative branching enzyme cDNAs
have been shown to be functional by this assay (data not
shown). It is therefore unclear whether the wSBE I-D2 gene
in λE7 codes for an active enzyme in vivo.

Example 5 Gene Structure in E7

i. Sequence of wSBE I-D2

We sequenced 9.2 kb of DNA that contained
wSBE I-D2. This corresponds to fragments 7.31, 7.8 and
7.18. Fragment 7.31 was sequenced in its entirety (4.1 kb),
but the sequence of about 30 bases about 2 kb upstream of
the start of the gene could not be obtained because it was
composed entirely of Gs. Elevation of the temperature of
sequencing did not overcome this problem. Fragments 7.8
(1 kb) and 7.18 (4 kb) were completely sequenced, and
corresponded to 2 kb downstream of the last exon detected
for this gene. It was clear that we had isolated a gene
which was closely related (approximately 95% sequence
identity) to the SBE I-D2 type cDNA referred to above,
except that the last 200 bp at the 3' end of the cDNA are
not present. The wSBE I-D2 gene includes sequences
corresponding to rice exon 11 which are not in the cDNA
clone. In addition it does not have exons 9, 12, 13 or 14;
these are also absent from the SBE I-D2 type cDNA. The
first two exons show lower identity to the corresponding
exons from rice (approximately 60%) (Kawasaki et al, 1993)
than do the other exons (about 80%). A diagrammatic
exon-intron structure of the wSBE I-D2 gene is indicated in
Figure 5. The restriction map was confirmed by sequencing
the PCR products that spanned fragments 7.18 and 7.8, and
7.8 and E7.31 (see Figure 3), respectively.
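
The specification does not state how the percentage identity
figures were computed. One common convention, sketched below
purely as an illustration, is to count matching positions
over the aligned columns in which neither sequence has a gap.
The two short sequences are hypothetical and are not taken
from wSBE I-D2 or its cDNA.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 20 columns, one mismatch.
gene_fragment = "ATGGCTAGCTAGGCTTACGA"
cdna_fragment = "ATGGCTAGCTAGGCTTACGT"
print(percent_identity(gene_fragment, cdna_fragment))
```

Other conventions (for example, counting gapped columns as
mismatches) give lower values for the same alignment.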

ii. Sequence of wSBE I-D3

This gene was not sequenced in detail, as the
genomic clone did not extend far enough to include the 5'
end of the sequence. The sequence is of a SBE I type. The
orientation of the gene is evident from sequencing of the
relevant BamHI fragments, and was confirmed by sequence
analysis of a PCR product generated using primers from the
right arm of lambda and a primer from the middle of the
gene. The sequence homology with wSBE I-D2 is about 80% over
the regions examined. The 2 kb sequenced corresponded to
exons 5 and 6 of the rice gene; these sequences were
obtained by sequencing the ends of fragments 7.5, 7.4 and
7.14 respectively, although the sequences from the left end
of fragment 7.14 did not show any homology to the rice
sequences. The gene does not appear to share the 3' end of
SBE I-D2 type cDNA, as a probe from 500 bp at the 3' end of
the cDNA (including sequences corresponding to exons 8 and
10 from rice) did not hybridise to fragment 7.14, although
it hybridised to fragment 7.18.

iii. Sequence of wSBE I-D1

This gene was also not sequenced in detail, as it
was clear that the genomic clone did not extend far enough
to include the 5' sequences. Limited sequencing suggests
that it is also a SBE I type gene. The orientation relative
to the left arm of lambda was confirmed by sequencing a PCR
product that used a primer from the left arm of lambda and
one from the middle of the gene (as above). Its sequence
homology with wSBE I-D2, D3 and D4 (see below) is about 75%
in the region sequenced corresponding to a part of exon 4 of
the rice gene.

Starch branching enzymes are members of the
α-amylase protein family, and in a recent survey Svensson
(1994) identified eight residues in this family that are
invariant, seven in the catalytic site and a glycine in a
short turn. Of the seven catalytic residues, four are
changed in SBE I-D2 type. However, additional variation in
the 'conserved' residues may come to light when more plant
cDNAs for branching enzyme I are available for analysis. In
addition, although exons 9, 11, 12, 13 and 14 from rice are
not present in the SBE I-D2 type cDNA, comparison of the
maize and rice SBE I sequences indicates that the 3' region
(from amino acid residue 730 of maize) is much more variable
than the 5' and central regions. The active sites of rice
and maize SBE I sequences, as indicated by Svensson (1994),
are encoded by sequences that are in the central portion of
the gene. When SBE II sequences from Arabidopsis were
compared by Fisher et al (1996) they also found variation at
the 3' and 5' ends. SBE I-D2 type cDNA may encode a novel
type of branching enzyme whose activity is not adequately
detected in the current assays for detecting branching
enzyme activity; alternatively the cDNA may correspond to an
endosperm mRNA that does not produce a functional protein.

Example 6 Cloning of the cDNA corresponding to the
wSBE I-D4 gene

The first strand cDNAs were synthesized from 1 µg
of total RNA, derived from endosperm 12 days after
pollination, as described by Sambrook et al (1989), and then
used as templates to amplify two specific cDNA regions of
wheat SBE I by PCR.

Two pairs of primers were used to obtain the cDNA
clones BED1 and BED3 (Table 1). Primers used for cloning of
BED3 were the degenerate primer NTS5'

5' GGC NAG NGC NGA G/AGA C/TGG 3' (SEQ ID NO. 1),

based on the N-terminal sequence of the purified
wheat endosperm SBE I protein, in which the 5' end of the
primer is at position 168 of wSBE I-D4 cDNA, as shown in
Table 1, and the primer NTS3'

5' TAC ATT TCC TTG TCC ATCA 3' (SEQ ID NO. 2)

in which the 5' end is at position 1590 of
wSBE I-D4 cDNA (see Table 1), designed to anneal to the
conserved regions of the nucleotide sequences of BED5 and
the maize and rice SBE I cDNAs. For clone BED1, the
primers used were BEC5'

5' ATC ACG AGA GCT TGC TCA 3' (SEQ ID NO. 3)

in which the 5' end is at position 1 of wSBE I-D4
cDNA (see Table 1); the sequence was based on the wSBE I-D4
gene, and BEC3'

5' CGG TAC ACA GTT GCG TCA TTT TC 3' (SEQ ID NO. 4)

in which the 5' end is at position 334 of
wSBE I-D4 cDNA (see Table 1), and the sequence was based on
BED3.
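
As an illustration of what the degeneracy of primer NTS5'
implies, the following sketch counts and enumerates the
individual oligonucleotides in the mixture. It assumes that
the slash notation G/A and C/T in the primer corresponds to
the IUPAC codes R and Y respectively, and that N denotes any
of the four bases; the IUPAC string below is a transcription
made on that assumption and does not appear in the
specification.

```python
from itertools import product

# Subset of IUPAC nucleotide codes needed for this primer.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # G/A
    "Y": "CT",    # C/T
    "N": "ACGT",  # any base
}

# ASSUMPTION: NTS5' (GGC NAG NGC NGA G/AGA C/TGG) rewritten in IUPAC codes.
NTS5_IUPAC = "GGCNAGNGCNGARGAYGG"


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str):
    """Yield every non-degenerate sequence represented by the primer."""
    for combo in product(*(IUPAC[base] for base in primer)):
        yield "".join(combo)


print(len(NTS5_IUPAC), "nt primer,", degeneracy(NTS5_IUPAC), "sequences")
```

Under these assumptions the three N positions and the two
two-fold positions multiply together, so each specific
sequence makes up only a small fraction of the primer
mixture.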

Example 7 Identification of the gene from the Triticum
tauschii SBE I family which is expressed in
the endosperm

We have isolated two classes of SBE I genomic
clones from T. tauschii. One class contained two genomic
clone isolates, and this class has been characterised in
some detail (Rahman et al, 1997). The complete gene
contained within this class of clones was termed wSBE I-D2;
there were additional genes at either end of the clone, and
these were designated wSBE I-D1 and wSBE I-D3. The other
class contained nine genomic clone isolates. Of these λE1
was arbitrarily taken as a representative clone, and its
restriction map is shown in Figure 3; the SBE I gene
contained in this clone was called wSBE I-D4.

Fragments E1.1 (0.8 kb) and E1.2 (2.1 kb) and
fragments E1.7 (4.8 kb) and E1.5 (3 kb) respectively were
completely sequenced. Fragment E1.7 was found to encode the
N-terminal of the SBE I, which is found in the endosperm as
described in Morell et al (1997). This is shown in
Figure 6. Using antibodies raised against the N-terminal
sequence, Morell et al (1997) found that the D genome
isoform was the most highly expressed in the cultivars
Rosella and Chinese Spring. We have thus isolated from
T. tauschii a gene, wSBE I-D4, whose homologue in the
hexaploid wheat genome encodes the major isoform for SBE I
that is found in the wheat endosperm.



WO 99/14314 



PCT/AU 98/00743 




Table 1

Location of structural features and probes within wSBE I-D4
sequence

A. Location of exons by comparison with the cDNA sequence of
Repellin et al (1997). Accession number Y12320.

Exon number     Start posn     End posn
     1             4890           4987
     2             5082           5149
     3             5524           5731
     4             5819           5888
     5             6149           [illegible]
     6             6519           7424
     7             7744           7860
     8             8015           8077
     9             8562           8670
    10             9137           9237
    11             9421           9488
    12             9580           9661
    13             9781           9897
    14             9990          10480

B. Other features.

Name of feature                        wSBE I-D4        D4 cDNA
                                       sequence         sequence
Putative initiation of translation     4900             11
Mature N-terminal sequence of SBE I    5550             124
End of translated SBE I sequence       10225            2431
End of D4 cDNA sequence                10461            2687
wSBE I-D45                             4870, 5860       1, 354
wSBE I-D43                             10116, 10435     2338, 2657
E1.1                                   5680, 6400       380, 630
BED 1                                                   1, 354
BED 2                                                   169, 418
BED 3                                                   151, 1601
BED 4                                                   867, 2372
BED 5                                                   867, 2687
Endosperm box like motif TGAAAAGT      4480, 590
CAAAT motif                            4863
TATAAA motif                           4833

All nine genomic clones of the λE1 type isolated
from T. tauschii appear to contain the wSBE I-D4 gene, or
very similar genes, on the basis of PCR amplification and
hybridisation experiments. However, the restriction
patterns obtained for the clones differ with BamHI and
EcoRI, among other enzymes, indicating that either the
clones represent near-identical but distinct genes or they
represent the same gene isolated in distinct products of the
Sau3A digest used to generate the library.

Example 8 Investigation of other SBE I genomic clones
isolated

All ten members of the λE1-like class of SBE I
genomic clones were investigated by hybridisation with
probes derived from fragment E1.7 (sequence wSBE I-D45,
encoding the translation start signal and the first
100 amino acids from the N-terminal end and intron
sequences; see Table 1) and from fragment E1.5 (sequence
wSBE I-D43, corresponding largely to the 3' untranslated
sequence and containing intron sequences; see Table 1). The
results obtained were consistent with one type of gene being
isolated in different fragments in the different clones, as
shown in Figure 7. PCR products were obtained from the
clones λE1, 2, 9, 14, 27, 31 and 52 using primers that
amplify near the 5' end of the gene (positions 5590-6162 of
wSBE I-D4). These products hybridised to wSBE I-D45.
Sequencing showed no differences in sequence of a 200 bp
product.

Analysis of the promoter for wSBE I-D4 allows us
to investigate the presence of motifs previously described
for promoters that regulate gene expression in the
endosperm. Forde et al (1985) compared prolamin promoters,
and suggested that the presence of a motif approximately
300 bp upstream of the transcription start point, called
the endosperm box, was responsible for endosperm-specific
expression. The endosperm box was subsequently considered
to consist of two different motifs: the endosperm motif (EM)
(canonical sequence TGTAAAG) and the GCN4 motif (canonical
sequence G/ATGAG/CTCAT). The GCN4 box is considered to
regulate expression according to nitrogen availability
(Muller and Knudsen, 1993). The wSBE I-D4 promoter contains
a number of imperfect EM-like motifs at approximately -100,
-300 and -400 as well as further upstream. However, no GCN4
motifs could be found, which lends support to the idea that
this motif regulates response to nitrogen, as starch
biosynthesis is not as directly dependent on the nitrogen
status of the plant as storage protein synthesis. Comparison
of the promoters for wSBE I-D4 and D2 (Rahman et al, 1997)
indicates that although there are no extensive sequence
homologies there is a region of about 100 bp immediately
before the first encoded methionine where the homology is
61% between the two promoters. In particular there is an
almost perfect match in the sequence over twenty base pairs,
CTCGTTGCTTCC/TACTCCACT (positions 4723-4742 of the wSBE I
sequence), but the significance of this is hard to gauge, as
it does not occur in the rice promoter for SBE I. The
availability of more promoters for starch biosynthetic
enzymes may allow firmer conclusions to be drawn. There are
putative CAAT and TATA motifs at positions 4870 and 4830
respectively of wSBE I-D4 sequence. The putative start of
translation of the mRNA is at position 4900 of wSBE I-D4.
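
A search of the kind described above, for perfect GCN4
motifs and for imperfect EM-like motifs, might be sketched as
follows. This is not the procedure actually used, which is
not described in this specification. The GCN4 pattern is a
reading of the printed motif as (G/A)TGA(G/C)TCAT, the
one-mismatch threshold is an arbitrary choice, only the
forward strand is scanned, and the test sequence is a
hypothetical string rather than the wSBE I-D4 promoter.

```python
import re

EM_MOTIF = "TGTAAAG"  # canonical endosperm motif, from the text

# ASSUMPTION: G/ATGAG/CTCAT read as (G/A)TGA(G/C)TCAT.
GCN4_PATTERN = re.compile(r"[GA]TGA[GC]TCAT")


def near_matches(seq: str, motif: str, max_mismatches: int = 1):
    """Return (0-based start, matched window, mismatches) on the forward strand."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Hypothetical toy sequence containing one imperfect and one perfect EM.
toy_promoter = "CCGTGAAAAGTCCATGTAAAGCC"

em_hits = near_matches(toy_promoter, EM_MOTIF)
gcn4_hit = GCN4_PATTERN.search(toy_promoter)
print(em_hits, gcn4_hit)
```

A fuller analysis would also scan the reverse complement and
report positions relative to the transcription start point.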

Figure 5 shows the structure of the wSBE I-D4
gene, compared with the genes from rice and wheat (Kawasaki
et al, 1993; Rahman et al, 1997). The rice SBE I has 14
exons compared with 13 for wSBE I-D4 and 10 for wSBE I-D2.
There is good conservation of exon-intron structure between
the three genes, except at the extreme 5' end. In particular
the sizes of intron 1 and intron 2 are very different
between rice SBE I and wSBE I-D4.

Example 9 Isolation of cDNA for SBE I

Using the maize starch branching enzyme I cDNA as
a probe (Baba et al, 1991), 10 positive plaques were
recovered by screening approximately 10^5 plaques from a
wheat endosperm cDNA library prepared from the cultivar
Rosella, as described in Example 4. On purifying and
sequencing these plaques it was clear that even the longest
clone (BED5, 1822 bp) did not encode the N-terminal sequence
obtained from protein analysis. Degenerate primers based on
the wheat endosperm SBE I protein N-terminal sequence
(Morell et al, 1997) and the sequence from BED5 were then
used to amplify the 5' region: this produced a cDNA clone
termed BED3 (Table 1 and Figure 8). This cDNA clone
overlapped extensively and had 100% sequence identity with
BED5 and BED4 (Figure 8). As almost the entire protein
N-terminal sequence had been included in the primer sequence
design, this did not provide independent evidence of the
selection of a cDNA sequence in the endosperm that encoded
the protein sequence of the main form of SBE I. Using
BED3 to screen a second cDNA library produced BED2, which is
shorter than BED3 but confirmed the BED3 sequence at 100%
identity between positions 169 and 418 (Figure 8 and
Table 1). In addition the entire cDNA sequence for BED3
could be detected at a 100% match in the genomic clone λE1.
Primers based on the putative transcription start point
combined with a primer based on the incomplete cDNAs
recovered were then used to obtain a PCR product from total
endosperm RNA by reverse transcription. This led to the
isolation of the cDNA clone, BED1, of 300 bp, whose location
is shown in Figure 8. By analysing this product, a sequence
was again obtained that could be found exactly in the
genomic clone λE1, and which overlapped precisely with BED3.

The N-terminal of the protein matches that of
SBE I isolated from wheat endosperm by Morell et al (1997),
and thus the wSBE I-D4 cDNA represents the gene for the
predominant SBE I isoform expressed in the endosperm. The
encoded protein is 87 kDa; this is similar to proteins
encoded by maize (Baba et al, 1991) and rice (Nakamura et
al, 1992) cDNAs for SBE I and is distinct from the wSBE I-D2
cDNA described previously, in which the encoded protein was
74 kDa (Rahman et al, 1997).

Five cDNA clones were sequenced and their
sequences were assembled into one contiguous sequence using
a GCG program (Devereaux et al, 1984). The arrangement of
these sequences is illustrated in Figure 8, the nucleotide
sequence is shown in SEQ ID No: 5, and the deduced amino acid
sequence is shown in SEQ ID No: 6. The intact cDNA sequence,
wSBE I-D4 cDNA, is 2687 bp and contains one large open
reading frame (ORF), which starts at nucleotides 11 to 13
and ends at nucleotides 2432 to 2434. It encodes a
polypeptide of 807 amino acids with a molecular weight of
87 kDa. Comparison of the amino acid sequence encoded by
wSBE I-D4 cDNA with that encoded by maize and rice SBE I
cDNAs showed that there is 75-80% identity between any
two of these sequences at the nucleotide level and almost 90%
at the amino acid level. Alignment of these three
polypeptide sequences, as shown in Figure 4, along with the
deduced sequences for pea, potato and wSBE I-D2 type cDNA,
indicated that the sequences in the central region are
highly conserved, and sequences at the 5' end (about
80 amino acids) and the 3' end (about 60 amino acids) are
variable.
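
The stated polypeptide length can be checked against the ORF
coordinates given above, as in the following sketch. The
coordinates are taken from the text; the average residue mass
of 110 Da is a general rule of thumb introduced here as an
assumption, so the resulting mass is only a rough check on
the 87 kDa figure.

```python
ORF_START = 11    # first nucleotide of the initiation codon (from the text)
ORF_END = 2434    # last nucleotide of the termination codon (from the text)

# ASSUMPTION: rule-of-thumb average mass per amino acid residue, in daltons.
AVG_RESIDUE_DA = 110


def orf_summary(start: int, end: int):
    """Return (ORF length in nt, number of codons, encoded residues excluding stop)."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length // 3
    return length, codons, codons - 1


length_nt, codons, residues = orf_summary(ORF_START, ORF_END)
approx_kda = residues * AVG_RESIDUE_DA / 1000
print(length_nt, codons, residues, f"~{approx_kda:.0f} kDa")
```

The residue count obtained in this way agrees with the
807 amino acids stated above.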

Svensson et al (1994) indicated that there were
several invariant residues in sequences of the α-amylase
super-family of proteins to which SBE I belongs. In the
sequence of maize SBE I these are in motifs commencing at
amino acid residue positions 341, 415, 472 and 537
respectively; these are also encoded in the wSBE I-D4
sequence (SEQ ID No: 9), further supporting the view that
this gene encodes a functional enzyme. This is in contrast
to the results with the wSBE I-D2 gene, where three of the
conserved motifs appear not to be encoded (Rahman et al,
1997).

There is about 90% sequence identity in the
deduced amino acid sequence between wSBE I-D4 cDNA and rice
SBE I cDNA in the central portion of the molecule (between
residues 160 and 740 for the deduced amino acid product from
wSBE I-D4 cDNA). The sequence identity of the deduced amino
acid sequence of the wSBE I-D4 cDNA to the deduced amino
acid sequence of wSBE I-D2 is somewhat lower (85% for the
most conserved region, between residues 285 and 390 for the
deduced product of wSBE I-D4 cDNA). Surprisingly, however,
wSBE I-D4 cDNA is missing the sequence that encodes amino
acids at positions 30 to 58 in rice SBE I (see Figure 4).
This corresponds to residues within the transit peptide of
rice SBE I. A corresponding sequence also occurs in the
deduced amino acid sequence from maize SBE I (Baba et al,
1991) and wSBE I-D2 type cDNA (Rahman et al, 1997).
Consequently the transit sequence encoded by wSBE I-D4 cDNA
is unusually short, containing only 38 amino acids, compared
with 55-60 amino acids deduced for most starch biosynthetic
enzymes in cereals (see for example Ainsworth, 1993; Nair et
al, 1997). The wSBE I-D4 gene does contain this sequence,
but this does not appear to be transcribed into the major
species of RNA from this gene, although it can be detected
at low relative abundance. This raises the possibility of
alternative splicing of the wSBE I-D4 transcript, and also
the question of the relative efficiency of
translation/transport of the two isoforms. The possibility
of alternative splicing in both rice and wheat has been
considered for soluble starch synthase (Baba et al, 1993;
Rahman et al, 1995). Alternative splicing of soluble starch
synthase would give a transit sequence of 40 amino acids,
which is similar to the length proposed for the product of
wSBE I-D4 cDNA.

We have previously used probes based on exons 4, 5
and 6 (E7.8 and E1.1, see Rahman et al, 1997) of wSBE I-D2 to
probe wheat and T. tauschii genomic DNA cleaved with PvuII
and BamHI respectively. This region is highly conserved
within rice SBE I, wSBE I-D2 and wSBE I-D4 and produced ten
bands with wheat DNA and five with T. tauschii DNA. Neither
PvuII nor BamHI cleaved within the probe sequences,
suggesting that each band represented a single type of SBE I
gene. We have described four SBE I genes from T. tauschii:
wSBE I-D1, wSBE I-D2, wSBE I-D3 and wSBE I-D4 (Rahman et al,
1997 and this specification), and so we may have accounted
for most of the genes in T. tauschii and, by extension, the
genes from the D genome of wheat. In wheat, at least two
hybridising bands could be assigned to each of
chromosomes 7A, 7B and 7D.

Example 10 Tissue specificity and expression during
endosperm development

The 300 bp of 3' untranslated sequence of
wSBE I-D4 cDNA does not show any homology with either the
wSBE I-D2 type cDNA that we have described earlier (Rahman
et al, 1997) or with BE-I from rice, as shown in Figure 5.
We have called this sequence wSBE I-D43C (see SEQ ID No: 9).
It seemed likely that wSBE I-D43C would be a specific probe
for this class of SBE-I, and thus it was used to investigate
the tissue specificity. Hybridization of RNA from endosperm
of hexaploid T. tauschii cultures with SBE I, SBE II, SSS I,
DBE I, wheat actin, and wheat ribosomal RNA was examined.
RNA was purified at various numbers of days after anthesis
from plants grown with a 16 h photoperiod at 13 °C (night)
and 18 °C (day). The age of the endosperms from which RNA
was extracted, in days after anthesis, is given above the
lanes in the blot. Equivalent amounts of RNA were
electrophoresed in each lane. The probes used are identified
in Tables 1 and 2.

The results are shown in Figures 9a to 9g. An RNA
species of about 2700 bases in size was found to hybridise.
This is very close to the size of the wSBE I-D4 cDNA
sequence. RNA hybridising to wSBE I-D43C is most abundant
at the mid-stage of endosperm development, as shown in
Figure 9a, and in field grown material is relatively
constant during the period 12-18 days, the time at which
there is rapid starch and storage protein accumulation
(Morell et al, 1995).

The sequence contained within the wSBE I-D4 gene
appears to be expressed only in the endosperm (Figure 9a,
Figure 9b). We could not detect any expression in the leaf.
This could be because another isoform is expressed in the
leaf, and/or because the amount of SBE I present in the leaf
is much less than what is required in the endosperm.
Isolation of SBE I clones from a leaf cDNA library would
enable this question to be resolved.

Example 11 Intron-Exon Structure of SBE I 

By comparison of the cDNA sequence of SBE I
(Repellin et al, 1997) with that of wSBE I-D4 we can deduce
the intron-exon structure of the gene for the major isoform
of SBE I that is found in the endosperm. The structure
contains 14 exons compared to 14 for rice (Kawasaki et al,
1993). These 14 exons are spread over 6 kb of sequence, a
distance similar to that found in both rice SBE I and
wSBE I-D2. A dotplot comparison of the wSBE I-D4 sequence and
that of rice SBE I, depicted in Figure 10, shows
good sequence identity over almost the entire gene starting
from about position 5100 of wSBE I-D4; the identity is poor
over the first 5 kb of sequence, corresponding largely to the
promoter sequences. The sequence identity over introns
(about 60%) is lower than over exons (about 85%).
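
By way of illustration only, the kind of dotplot and percent
identity comparison referred to in this Example can be sketched
in Python as follows. The two short sequences and the window
size are hypothetical placeholders and are not taken from
wSBE I-D4 or rice SBE I; the sketch simply records window
positions at which two sequences match exactly, and computes
identity for a pair of sequences that are assumed to be already
aligned.

```python
def dotplot(seq_a, seq_b, window=4):
    """Return the set of (i, j) positions (0-based) at which a window of
    seq_a starting at i exactly matches a window of seq_b starting at j."""
    # Index every window of seq_b so each window of seq_a is looked up once.
    index = {}
    for j in range(len(seq_b) - window + 1):
        index.setdefault(seq_b[j:j + window], []).append(j)
    hits = set()
    for i in range(len(seq_a) - window + 1):
        for j in index.get(seq_a[i:i + window], []):
            hits.add((i, j))
    return hits


def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.
    Columns containing a gap ('-') are counted as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy sequences, for illustration only.
toy_a = "ACGTACGTTTGACC"
toy_b = "GGACGTACGTAACC"
diagonal = sorted(dotplot(toy_a, toy_b, window=6))
```

A run of consecutive hits such as (0, 2), (1, 3), (2, 4) forms a
diagonal in the plot, which is how a region of good identity
appears in a figure of this kind.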

Example 12 Repeated Sequences in SBE I 

Sequencing of wSBE I-D4 revealed there was a
repeated sequence of at least 300 bp contained in a 2 kb
fragment about 600 bp after the 3' end of the gene. We have
called this sequence wSBE I-D4R (SEQ ID No: 9). This
repeated sequence is within fragment E1.5 (Figure 3 and
Table 1) and is flanked by non-repetitive sequences from the
genomic clone. We have previously shown that the
restriction pattern obtained by digesting λE1 with the
restriction enzyme BamHI is also obtained when T. tauschii
DNA is digested. Thus wSBE I-D4R is unlikely to be a
cloning artefact. A search of the GenBank Database revealed
that wSBE I-D4R shared no significant homology with any
sequence in the database. Hybridisation experiments with
wSBE I-D4R showed that all of the other SBE I-D4 type
genomic clones (except number 29) contained this repeated
sequence (data not shown). The wSBE I-D4R sequence was not
highly repeated and occurred in the wheat genome with a
similar frequency as the wSBE I-D4 sequence.
When wSBE I-D4R was used as the probe on wheat DNA
from the nulli-tetra lines, four bands were obtained; two of
these bands could be assigned to chromosome 7A and the
others to chromosomes 7B and 7D (Figure 11). One of the two
BamHI fragments from wheat DNA which could be assigned to
chromosome 7A was distinct from the single band from
chromosome 7A detected using wSBE I-D43 as the probe; the
other three bands coincided in the autoradiograph with bands
obtained with wSBE I-D43, and are likely to represent the
same fragment. However, one of these fragments was distinct
from the BamHI fragment that hybridised to the wSBE I-D43
sequence. In wSBE I-D4 (see SEQ ID No: 9), the wSBE I-D43
sequence is only 300 bp upstream of wSBE I-D4R, and occurs
in the same BamHI fragment. These results suggest that the
wSBE I-D4R sequence can occur independently of wSBE I-D4 in
the wheat genome.

Example 13 Isolation of Genomic Clones Encoding SBE II 

Screening of a cDNA library, prepared from the
wheat endosperm as described in Example 4, with the maize
BE I clone (Baba et al, 1991) at low stringency led to the
isolation of two classes of positive plaques. One class was
strongly hybridising, and led to the isolation of wheat
SBE I-D2 type and SBE I-D4 type cDNA clones, as described in
Example 5 and in Rahman et al (1997). The second class was
weakly hybridising, and one member of this class was
purified. This weakly hybridising clone was termed SBE-9,
and on sequencing was found to contain a sequence that was
distinct from that for SBE I. This sequence showed greatest
homology to maize BE II sequences, and was considered to
encode part of the wheat SBE II sequence.

The screening of approximately 5 x 10^5 plaques
from a genomic library constructed from T. tauschii (see
Example 1) with the SBE-9 sequence led to the isolation of
four plaques that were positive. These were designated
wSBE II-D1 to wSBE II-D4 respectively, and were purified and
analysed by restriction mapping. Although they all had
different hybridization patterns with SBE-9, as shown in
Figure 12, the results were consistent with the isolation of
the same gene in different-sized fragments.

Example 14 Identification of the N-terminal sequence of
SBE II

Sequencing of the SBE II gene contained in
clone 2, termed SBE II-D1 (see SEQ ID No: 10), showed that it
coded for the N-terminal sequence of the major isoform of
SBE II expressed in the wheat endosperm, as identified by
Morell et al (1997). This is shown in Figure 13.

Example 15 Intron-Exon Structure of the SBE II Gene 

In addition to encoding the N-terminal sequence of
SBE II, as shown in Example 10, the cDNA sequence reported
by Nair et al (1997) was also found to have 100% sequence
identity with part of the sequence of wSBE II-D1. Thus the
intron-exon structure can be deduced, and this is shown in
Figure 14. The positions of exons and other major structural
features of the SBE II gene are summarized in Table 2.

Example 16 Number of SBE II Genes in T. tauschii and
Wheat

Hybridisation of the SBE II conserved region with
T. tauschii DNA revealed the presence of three gene classes.
However, in our screening we only recovered one class.
Hybridisation to wheat DNA indicated that the locus for
SBE II was on chromosome 2, with approximately 5 loci in
wheat; most of these appear to be on chromosome 2D, as shown
in Figure 15.

Table 2

Positions of structural features in wSBE II-D1.

A. Positions of exons.

Exon number    Genomic start    Genomic finish
 1              1058             1336
 2              1664             1761
 3              2038             2279
 4              2681             2779
 5              2949             2997
 6              3145             3204
 7              3540             3620
 8              3704             3825
 9              4110             4188
10              4818             4939
11              5115             5234
12              6209             6338
13              6427             6549
14              6739             6867
15              7447             7550
16              8392             8536
17              9556             9703
18              9839             9943
19             10120            10193
20             10395            10550
21             10928            11002
22             11092            11475

B. Other structural features within the wSBE II-D1 DNA
sequence

Feature                                   Position
Putative initiation of translation        1214
Mature N-terminal sequence of SBE II      1681
wSBE II-D13                               11116 to 11448
Endosperm box like motif TGAAAAGT         521
Endosperm box like motif TGAAAGT          565
Endosperm box like motif CGAAAAT          669
Endosperm box like motif TAAATGT          768
CAAAAT motif                              784
TCAATT motif                              1108
TATAAA motif                              799
AATTAA motif                              1110

Example 17 Expression of SBE II

Investigation of the pattern of expression of
SBE II revealed that the gene was only expressed in the
endosperm. However the timing of expression was quite
distinct from that of SBE I, as illustrated in
Figures 9a, 9b and 9c.

SBE I gene expression is only clearly detectable
from the mid-stage of endosperm development (10 days after
anthesis in Figure 9b), whereas SBE II gene expression is
clearly seen much earlier, in endosperm tissue at 5-8 days
after anthesis (Figures 9a and 9c), corresponding to an
early stage of endosperm development. The hybridisation of
wheat endosperm mRNA with the actin and ribosomal RNA genes
is shown as controls (Figures 9f and 9g, respectively).

Example 18 Cloning of Wheat Soluble Starch Synthase
cDNA

A conserved sequence region was used for the
synthesis of primers for amplification of SSS I by
comparison with the nucleotide sequences encoding soluble
starch synthases of rice and pea. A 300 bp RT-PCR product
was obtained by amplification of cDNA from wheat endosperm
at 12 days post anthesis. The 300 bp RT-PCR product was
then cloned, and its sequence analysed. The comparison of
its sequence with rice SSS cDNA showed about 80% sequence
homology. The 300 bp RT-PCR product was 100% homologous to
the partial sequence of a wheat SSS I in the database
produced by Block et al (1997).

The 300 bp cDNA fragment of wheat soluble starch
synthase thus isolated was used as a probe for the screening
of a wheat endosperm cDNA library (Rahman et al, 1997).
Eight cDNA clones were selected. One of the largest cDNA
clones (sm2) was used for DNA sequencing analysis, and gave
a 2662 bp nucleotide sequence, which is shown in SEQ ID
NO: 14. A large open reading frame of this cDNA encoded a
647 amino acid polypeptide, starting at nucleotides 247 to
250 and terminating at nucleotides 2198 to 2200. The
deduced polypeptide was shown by protein sequence analysis
to contain the N-terminal sequence of a 75 kDa granule-bound
protein (Rahman et al, 1995). This is illustrated in
Figure 16. The location of the 75 kDa protein was
determined for both the soluble fraction and starch
granule-bound fraction by the method of Denyer et al (1995). Thus
this cDNA clone encoded a polypeptide comprising a 41 amino
acid transit peptide and a 606 amino acid mature peptide
(SEQ ID NO: 12). The cleavage site LRRL was located at amino
acids 36 to 39 of the transit peptide of this deduced
polypeptide.
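
For illustration, the identification of a large open reading
frame in a cDNA sequence can be sketched in Python as below.
This is a minimal sketch and not the analysis actually used:
it scans only the three forward reading frames, and the toy
sequence is a hypothetical placeholder rather than any part of
the sm2 clone. The only values taken from this Example are the
transit peptide length (41), the mature peptide length (606)
and their total (647).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end, n_codons) for the longest ATG-initiated open
    reading frame on the forward strand, or None if there is none.
    start is the 1-based position of the A of ATG, end is the 1-based
    position of the last base of the stop codon, and n_codons counts
    the start codon but not the stop codon."""
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                n_codons = (pos - start) // 3
                if best is None or n_codons > best[2]:
                    best = (start + 1, pos + 3, n_codons)
                start = None
    return best


# Lengths stated in this Example: transit peptide + mature peptide.
TRANSIT_LEN = 41
MATURE_LEN = 606
TOTAL_LEN = TRANSIT_LEN + MATURE_LEN  # 647 amino acids

# Hypothetical toy sequence: ATG AAA TTT GGG TAA in the third frame.
toy_orf = longest_orf("CCATGAAATTTGGGTAACC")
```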

Comparison of wheat SSS I with rice SSS and potato
SSS showed that there is 87.4% and 75.9% homology respectively
at the amino acid level, and 74.7% and 58.1% homology
respectively at the nucleotide level. Some amino acids in the
N-terminal sequences of the SSS I of wheat and rice were
conserved. Major features of the SSS I gene are summarized in
Table 3.

Example 19 Isolation of Genomic Clone of Wheat Soluble
Starch Synthase

Seven genomic clones were obtained with a 300 bp
cDNA probe by screening approximately 5 x 10^5 plaques from a
genomic DNA library of Triticum tauschii, as described
above. DNA was purified from 5 of these clones and digested
with BamHI and SacI. Southern hybridization analysis using
the 300 bp cDNA as probe showed that these clones could be
classified into two classes, as shown in Figure 17. One
genomic clone, sg3, contained a long insert, and was
digested with BamHI or SacI and subcloned into pBluescript
KS+ vector.



Table 3

Comparison of exons and introns of soluble starch synthase
I genes of wheat and rice

(1) Identity of exons of soluble starch synthase I genes of
wheat and rice

Exons   wSSI-D1   rSSI   identity (%)   start site   stop site
                                        (wSSI-D1)    (wSSI-D1)
1a      255       113    57.52          -253         0
1b      316       298    58.92          1            316
2       356       356    82.87          1473         1828
3       78        78     92.31          2746         2823
4       125       125    90.40          2906         3028
5       82        82     89.02          4113         4194
6       174       174    93.10          4286         4459
7       82        82     93.90          4562         4643
8       92        92     92.39          4743         4835
9       63        63     90.48          4959         5021
10      90        90     82.22          5103         5192
11      125       125    88.80          8594         8718
12      109       109    91.74          8807         8915
13      53        53     81.13          8992         9044
14      40        41     80.00          9160         9199
15a     159       113    79.65          9499         9657
15b     392       539    46.46          9658         10098

(2) Identity of introns of soluble starch synthase I genes
of wheat and rice

Introns   wSSI-D1   rSSI   identity (%)   start site   stop site
                                          (wSSI-D1)    (wSSI-D1)
1         1156      907    41.05          317          1472
2         917       851    41.65          1829         2745
3         82        87     45.12          2824         2905
4         1084      835    48.50          3029         4112
5         91        96     57.78          4195         4285
6         102       189    52.48          4460         4561
7         99        96     52.08          4644         4742
8         123       110    45.46          4836         4958
9         81        78     58.97          5022         5102
10        3401      663    37.56          5193         8593
11        88        124    56.82          8719         8806
12        76        81     48.68          8916         8991
13        115       135    45.22          9045         9159
14        299       830    45.80          9200         9498

Note: Exon 1a: non-coding region of exon 1. Exon 1b: coding
region of exon 1.
Exon 15a: coding region of exon 15. Exon 15b: non-coding
region of exon 15.
wSSI-D1: wheat soluble starch synthase I gene.
rSSI: rice soluble starch synthase I gene.

These subclones were analysed by sequencing. The
intron/exon structure of the sg3 rice gene is shown in
Figure 18. The SSS I gene from T. tauschii is shown in SEQ
ID No: 13, while the deduced amino acid sequence is shown in
SEQ ID NO: 14.
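
The comparison summarized in Table 3 can be explored with a
short Python sketch. The values below are transcribed from
Table 3 as read here; the variable and function names are
illustrative assumptions only. The sketch checks that each
intron length agrees with its start and stop sites, and
compares unweighted mean identities of exons and introns.

```python
from statistics import mean

# Exon identities (%) between wSSI-D1 and rSSI, transcribed from Table 3 (1).
EXON_IDENTITY = {
    "1a": 57.52, "1b": 58.92, "2": 82.87, "3": 92.31, "4": 90.40,
    "5": 89.02, "6": 93.10, "7": 93.90, "8": 92.39, "9": 90.48,
    "10": 82.22, "11": 88.80, "12": 91.74, "13": 81.13, "14": 80.00,
    "15a": 79.65, "15b": 46.46,
}

# Introns from Table 3 (2): number -> (wSSI-D1 length, identity %, start, stop).
INTRONS = {
    1: (1156, 41.05, 317, 1472), 2: (917, 41.65, 1829, 2745),
    3: (82, 45.12, 2824, 2905), 4: (1084, 48.50, 3029, 4112),
    5: (91, 57.78, 4195, 4285), 6: (102, 52.48, 4460, 4561),
    7: (99, 52.08, 4644, 4742), 8: (123, 45.46, 4836, 4958),
    9: (81, 58.97, 5022, 5102), 10: (3401, 37.56, 5193, 8593),
    11: (88, 56.82, 8719, 8806), 12: (76, 48.68, 8916, 8991),
    13: (115, 45.22, 9045, 9159), 14: (299, 45.80, 9200, 9498),
}


def inconsistent_introns(introns=INTRONS):
    """Return intron numbers whose stated length differs from stop - start + 1."""
    return [n for n, (length, _ident, start, stop) in introns.items()
            if stop - start + 1 != length]


exon_mean = mean(EXON_IDENTITY.values())
intron_mean = mean(ident for _len, ident, _s, _e in INTRONS.values())
```

On these figures the unweighted mean identity of the exons is
well above that of the introns, in keeping with the general
pattern shown in the table.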

Example 20 Northern Hybridization Analysis of the
Expression of Genes Encoding Soluble Starch
Synthase

Total RNAs were purified from leaves, pre-anthesis
material, and various stages of developing endosperm at 5-8,
10-15 and 18-22 days post anthesis. Northern hybridization
analysis showed that mRNAs encoding wheat SSS I were
specifically expressed in developing endosperm.
Expression of these mRNAs in the leaves and pre-anthesis
materials could not be detected by northern hybridization
analysis under these experimental conditions. Wheat SSS I
mRNAs were expressed at high levels at an early stage of
endosperm development, 5-8 days post anthesis, and the
expression level in endosperm at 10-15 days post anthesis
was reduced. These results are summarized in Figure 9a and
Figure 9d.

Example 21 Genomic Localisation of Wheat Soluble Starch
Synthase

DNA from chromosome engineered lines was digested
with the restriction enzyme BamHI and blotted onto supported
nitrocellulose membranes. A probe prepared from the 3' end
of the cDNA sequence, from positions 2345 to 2548, was used
to hybridise to this DNA. The presence of a specific band
was shown to be associated with the presence of
chromosome 7A (Figure 19). These data demonstrate location
of the SSS I gene on chromosome 7.

Example 22 Isolation of SSS I Promoter 

We have isolated the promoter that drives this
pattern of expression for SSS I. The pattern of expression
for SSS I is very similar to that for SBE II: the SSS I gene
transcript is detectable from an early stage of endosperm
development until the endosperm matures. The sequence of
this promoter is given in SEQ ID No: 15.

Example 23 Isolation of the Gene Encoding Debranching
Enzyme from Wheat

The sugary-1 mutation in maize results in mature
dried kernels that have a glassy and translucent appearance;
immature kernels accumulate sucrose and other simple
sugars, as well as the water-soluble polysaccharide
phytoglycogen (Black et al, 1966). Most data indicate that
in sugary-1 mutants the concentration of amylose is
increased relative to that of amylopectin. Analysis of a
particular sugary-1 mutation (su-1Ref) by James et al
(1995) led to the isolation of a cDNA that shared
significant sequence identity with bacterial enzymes that
hydrolyse the α-1,6-glucosyl linkages of starch, such as an
isoamylase from Pseudomonas (Amemura et al, 1988), i.e.
bacterial debranching enzymes.

We have now isolated a sequence amplified from
wheat endosperm cDNA using the polymerase chain reaction
(PCR). This sequence is highly homologous to the sequence
for the sugary gene isolated by James et al (1995). This
sequence has been used to isolate homologous cDNA sequences
from a wheat endosperm library and genomic sequences from
Triticum tauschii.

Comparison of the deduced amino acid sequences of
DBE from maize with spinach (Accession SOPULSPO, GenBank
database), Pseudomonas (Amemura et al, 1988) and rice
(Nakamura et al, 1997) enabled us to deduce sequences which
could be useful in wheat. When these sequences were used as
PCR amplification primers with wheat genomic DNA a product
of 256 bp was produced. This was sequenced and was compared
to the sequence of maize sugary isolated by James et al
(1995). The results are shown in Figure 20a and Figure 20b.
This sequence has been termed wheat debranching enzyme
sequence I (WDBE-I).

Use of WDBE-I to investigate a cDNA library
constructed from wheat endosperm (Rahman et al, 1997)
enabled us to isolate two cDNA clones which hybridise
strongly to the WDBE-I probe. The nucleotide sequence of
the DNA insert in the longest of these clones is given in
SEQ ID No: 16.

Use of WDBE-I to investigate a genomic library
constructed from T. tauschii, as described above, has led to
the isolation of four genomic clones, designated I1, I2, I3
and I4, respectively, which hybridised strongly to the
WDBE-I sequence. These clones were shown to contain copies
of a single debranching enzyme gene. The sequence of one of
these clones, I2, is given in SEQ ID No: 17. The intron/exon
structure of the gene is shown in Figure 20c. Exons 1 to 4
were identified by comparison with the maize sugary-1 cDNA,
while Exons 5 to 18 were identified by comparison with the
cDNA sequence given in SEQ ID No: 16. The major features of
the DBE I gene are summarized in Table 4.

Hybridization of WDBE-I to DNA from T. tauschii
indicates one hybridizing fragment (Figure 21a). The
chromosomal location of the gene was shown to be on
chromosome 7 through hybridisation to nullisomic/tetrasomic
lines of the hexaploid wheat cultivar Chinese Spring
(Figure 21b).

We have clearly isolated a sequence from the wheat
genome that has high identity to the debranching enzyme cDNA
of maize characterised by James et al (1997). The isolation
of homologous cDNA sequences and genomic sequences enables
further characterisation of the debranching enzyme cDNA and
promoter sequences from wheat and T. tauschii. These
sequences and the WDBE-I sequences shown herein are useful
in the manipulation of wheat starch structure through
genetic manipulation and in the screening for mutants at the
equivalent sugary locus in wheat.

Figure 9e shows that the DBE I gene is expressed
during endosperm development in wheat and that the timing of
expression is similar to the SBE II and SSS I genes. Figure 9h
shows that the full length mRNA for the gene (3.0 kb) is
found only in the wheat endosperm.

Example 24 Transient assays of Promoter-GFP Fusions

DNA constructs

DNA constructs for transient expression assays
were prepared by fusing sequences from the BEII and SSI
promoters to the gene encoding the Green Fluorescent
Protein. Green Fluorescent Protein (GFP) constructs
contained the GFP gene described by Sheen et al (1995). The
nos 3' element (Bevan et al, 1983) was inserted 3' of the
GFP gene. The plasmid vector (pWGEM_NZfp) was constructed by
inserting the NotI to HindIII fragment from the following
sequence:

5' GCGGCCGCTC CCTGGCCGAC TTGGCCGAAG CTTGCATGCC TGCAGGTCGA
CTCTAGAGGA TCCCCGGGTA CCGAGCTCGA ATTCATCGAT GATATCAGAT
CCGGGCCCTC TAGATGCGGC CGCATGCATA AGCTT 3'

into the NotI and HindIII sites of pGem-13Zf(-) vector
(Promega). The sequences at the junction of the wSSSIpro1
and wSSSIpro2 and GFP were identical, and included the
junction sequence:

5' ....CGCGCGCCCA CACCCTGCAG GTCGACTCTA GAGGATCCAT GGTGAGCAAG
3'.

The sequence at the junction of wsbeIIpro1 and GFP was:

5' GCGACTGGCT GACTCAATCA CTACGCGGGG ATCCATGGTG AGCAAGGGCG
3'.

The sequence at the junction of wsbeIIpro2 and GFP was:

5' GGACTCCTCT CGCGCCGTCC TGAGCCGCGG ATCCATGGTG AGCAAGGGCG
3'.
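
As an illustrative check only, the recognition sites present in
the polylinker sequence given above can be located with a short
Python sketch. The sequence is transcribed from the text of this
Example (with spaces removed and a stray hyphen in the printed
copy omitted, which is an assumption); the recognition sequences
used are the standard ones for NotI, HindIII, BamHI and EcoRI.
The function name is hypothetical.

```python
# Polylinker transcribed from this Example, in blocks of ten bases.
POLYLINKER = "".join([
    "GCGGCCGCTC", "CCTGGCCGAC", "TTGGCCGAAG", "CTTGCATGCC", "TGCAGGTCGA",
    "CTCTAGAGGA", "TCCCCGGGTA", "CCGAGCTCGA", "ATTCATCGAT", "GATATCAGAT",
    "CCGGGCCCTC", "TAGATGCGGC", "CGCATGCATA", "AAGCTT",
])

# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "NotI": "GCGGCCGC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
}


def find_sites(seq, site):
    """Return all 0-based start positions of site in seq (overlaps allowed)."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


site_map = {name: find_sites(POLYLINKER, site) for name, site in SITES.items()}
```

In this transcription the NotI site occurs at both ends of the
sequence and the HindIII site occurs near the 5' end and at the
3' terminus, which is consistent with a NotI to HindIII fragment
being excised for insertion into the vector.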

The structures of the constructs are shown in Figures 22a to
22f.

Table 4

Structural features of wDBEI-D1

Exon     Start        End          Comments
number   position     position
 1       1890         2241         (deduced by comparison with maize)
 2       (illegible)  (illegible)  (deduced by comparison with maize)
 3       2615         2707         (deduced by comparison with maize)
 4       3016         3168         (deduced by comparison with maize)
 5       3360         3436
 6       4313         4454
 7       4526         4633
 8       4734         4819
 9       5058         5129
10       5202         5328
11       5558         5644
12       6575         6671
13       7507         7661
14       8450         8527
15       8739         8823
16       8902         8981
17       9114         9231
18       Still being sequenced

Note that following nucleotides 3330, 6330 and 8419 there
may be short regions of DNA not yet sequenced.

B.

CAAAAT motif                          1833
TCAAT motif                           1838
ATAAATAA motif                        1804
Endosperm box like motif TAAAACG      1463
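
The positions of short promoter motifs such as those listed in
part B can be located by a simple scan. The following Python
sketch is illustrative only: the toy sequence is a hypothetical
placeholder and not part of wDBEI-D1, and the option of allowing
mismatches is an assumption intended to mimic the search for
"motif like" sequences rather than a description of how the
positions in the table were obtained.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def find_motif_like(seq, motif, max_mismatches=0):
    """Return 1-based start positions in seq at which a window of the
    same length as motif differs from it at no more than
    max_mismatches positions."""
    seq = seq.upper()
    motif = motif.upper()
    k = len(motif)
    return [i + 1 for i in range(len(seq) - k + 1)
            if hamming(seq[i:i + k], motif) <= max_mismatches]


# Hypothetical toy promoter fragment, for illustration only.
toy_promoter = "GGCAAAATCCCAAATTGG"
exact_hits = find_motif_like(toy_promoter, "CAAAAT")
near_hits = find_motif_like(toy_promoter, "CAAAAT", max_mismatches=1)
```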




Preparation of target tissue 

All explants used for transient assay were from
the hexaploid wheat cultivar Milliwang. Endosperm (10-12
days after anthesis), embryos (12-14 days after anthesis)
and leaves (the second leaf from the top of plants
containing 5 leaves) were used. Developing seed or leaves
were collected, surface sterilized with 1.25% w/v sodium
hypochlorite for 20 minutes and rinsed with sterile
distilled water 8 times. Endosperms or embryos were
carefully excised from seed in order to avoid contamination
with surrounding tissues. Leaves were cut into 0.5 cm x 1
cm pieces. All tissues were aseptically transferred onto
SD1SM medium, which is an MS based medium containing 1 mg/L
2,4-D, 150 mg/L L-asparagine, 0.5 mg/L thiamine, 10 g/L
sucrose, 36 g/L sorbitol and 36 g/L mannitol. Each agar
plate contained either 12 endosperms, 12 embryos or 2 leaf
segments.

Preparation of gold particles and bombardment 

Five µg of each plasmid was used for the
preparation of gold particles, as described by Witrzens et
al (1998). Gold particle-DNA suspension in ethanol (10 µl)
was used for each bombardment using a Bio-Rad helium-driven
particle delivery system, PDS-1000.

GFP assay

The expression of GFP was observed after 36 to 72
hours incubation using a fluorescence microscope. Two plates
were bombarded for each construct. The numbers of expressing
regions were recorded for each target tissue, and are
summarized in Table 5. The intensity of the expression of
GFP from each of the promoters was estimated by visual
comparison of the light intensity emitted, and is summarized
in Table 6.

The DNA construct containing GFP without a
promoter region (pZLGFPNot) gave no evidence of transient
expression in embryo (panel l) or leaf (panel r) and
extremely weak and sporadic expression in endosperm (panel
f); this construct gave only very weak expression in
endosperm with respect to the number (Table 5) and
intensity (Table 6) of transient expression regions. The
constructs pwsssIpro1gfpNOT (panels b, h and n),
psbeIIpro1gfpNOT (panels d, j and p), and psbeIIpro2gfpNOT
(panels e, k and q) yielded low numbers (Table 5) of
strongly (Table 6) expressing regions in leaves, and there
was a very uneven distribution of expressing regions between
target leaf pieces (Table 5). pwsssIpro2gfpNOT (panels c, i
and o) gave no evidence of transient expression in leaves
(Table 5). These results show that each of the promoter
constructs is able to drive the transient expression of GFP
in the grain tissues, endosperm and embryo. The ability of
the short SSI promoter (pwsssIpro1gfpNOT, containing 1042 bp
5' of the ATG translation start site) to drive expression in
leaves (panel n) contrasts with the inability of the long
SSI promoter (pwsssIpro2gfpNOT, containing a 3914 base pair
region 5' of the ATG translation start site, panel o),
suggesting that regions controlling tissue specificity
are located between -3914 and -1042 of the SSI promoter
region (SEQ ID No: 15).
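
The tissue pattern discussed above can be tabulated with a small
Python sketch. The relative intensities are those of Table 6 as
read here (control construct scored as 10 in each tissue); the
function names are illustrative assumptions, and the sketch does
no more than list which constructs gave no detectable expression
in a given tissue.

```python
# Relative intensities of transient expression, as read from Table 6.
INTENSITY = {
    "pact_js-gfg_nos":  {"endosperm": 10,  "embryo": 10,  "leaf": 10},
    "pwsssIpro1gfpNOT": {"endosperm": 4,   "embryo": 5.5, "leaf": 20},
    "pwsssIpro2gfpNOT": {"endosperm": 2.5, "embryo": 5.5, "leaf": 0},
    "psbeIIpro1gfpNOT": {"endosperm": 3.5, "embryo": 1.5, "leaf": 10},
    "psbeIIpro2gfpNOT": {"endosperm": 1.5, "embryo": 1,   "leaf": 10},
    "pZLGFPNot":        {"endosperm": 0.5, "embryo": 0,   "leaf": 0},
}


def silent_in(tissue, table=INTENSITY):
    """Constructs scored as 0 (no detectable expression) in the tissue."""
    return sorted(name for name, row in table.items() if row[tissue] == 0)


def grain_to_leaf_ratio(name, table=INTENSITY):
    """Mean of the endosperm and embryo scores divided by the leaf score,
    or None where the leaf score is 0."""
    row = table[name]
    if row["leaf"] == 0:
        return None
    return (row["endosperm"] + row["embryo"]) / 2 / row["leaf"]


leaf_silent = silent_in("leaf")
```

On these scores the only promoter-containing construct that is
silent in leaf is the one carrying the long SSI promoter, which
is the observation on which the conclusion above rests.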

Table 6

Comparison of the Intensities of Transient Expression

Tissue      pact_js-   pwsssIpro1   pwsssIpro2   psbeIIpro1   psbeIIpro2   pZLGFPNot
            gfg_nos    gfpNOT       gfpNOT       gfpNOT       gfpNOT
Endosperm   10         4            2.5          3.5          1.5          0.5
Embryo      10         5.5          5.5          1.5          1            0
Leaf        10         20           0            10           10           0

All intensities are relative to pact_js-gfg_nos transient
expression in the target tissue.
Relative intensities were independently scored by three
researchers and averaged.

Example 25 Stable transformation of rice

Stable transformation of rice using Agrobacterium
was carried out essentially as described by Wang et al
(1997). The plasmids containing the target DNA constructs
containing the promoter-reporter gene fusions are shown in
Figure 23. These plasmids were transformed into
Agrobacterium tumefaciens AGL1 by electroporation, and
cultured on selection plates of LB media containing
rifampicillin (50 mg/L) and spectinomycin (50 mg/L) for 2 to
3 days, and then gently suspended in 10 ml NB liquid medium
containing 100 µM acetosyringone and mixed well. Embryogenic
rice calli (2 to 3 months old) derived from mature seeds
were immersed in the A. tumefaciens AGL1
suspension. After 3-10 minutes the A. tumefaciens AGL1
suspension medium was removed, and the rice calli were
transferred to NB medium containing 100 µM acetosyringone
for 48 h. The co-cultivated calli were washed 7 times with
sterile Milli-Q H2O containing 150 mg/L timentin to remove
all Agrobacterium, plated on to NB medium containing 150
mg/L timentin and 30 mg/L hygromycin, and cultured for 3 to
4 weeks. Newly-formed buds on the surface of rice calli were
excised and plated onto NB Second Selection medium
containing 150 mg/L timentin and 50 mg/L hygromycin. After 4
weeks of proliferation calli were plated onto NB
Pre-Regeneration medium containing 150 mg/L timentin and 50 mg/L
hygromycin, and cultured for 2 weeks. The calli were then
transferred on to NB-Regeneration medium containing 150 mg/L
timentin and 50 mg/L hygromycin for 3 to 4 weeks. Once
shooting occurs, shoots are transferred onto rooting medium
(1/4 MS) containing 50 mg/L hygromycin. Once adequate root
formation occurs, the seedlings are transferred to soil,
grown in a misting chamber for 1-2 weeks, and grown to
maturity in a containment glasshouse.

Example 26 Use of probes from SSS I, SBE I, SBE II and
DBE sequences to identify null or altered
alleles for use in breeding programmes

DNA primer sets were designed to enable
amplification of the first 9 introns of the SBE II gene
using PCR. The design of the primer sets is illustrated in
Figure 24. Primers were based on the wSBE II-D1 sequence
(deduced from Figure 13b and Nair et al, 1997; Accession No.
Y11282) and were designed such that intron sequences in the
wSBE II sequence were amplified by PCR. These primer sets
individually amplify the first 9 introns of SBE II. One
primer (sr913F) contained a fluorescent label at the 5' end.
Following amplification, the products were digested with the
restriction enzyme DdeI and analysed using an ABI 377 DNA
Sequencer with Genescan™ fragment analysis software. One
primer set, for intron 5, was found to amplify products from
each of chromosomes 2A, 2B and 2D of wheat. This is shown in
Figure 25, which illustrates results obtained with various
wheat lines, and demonstrates that products from each of the
wheat genomes from diverse wheats were amplified, and that
therefore lines lacking the wSBE II gene on a specific
chromosome could be readily identified. Lane (iii)
illustrates the identification of the absence of the A
genome wSBE II gene from the hexaploid wheat cultivar Chinese
Spring ditelosomic line 2AS.

Figure 26 compares results of amplification with
an intron 10 primer set for various nullisomic/tetrasomic
lines of the hexaploid wheat Chinese Spring. Fluorescent
dUTP deoxynucleotides were included in the amplification
reaction. Following amplification, the products were
digested with the restriction enzyme DdeI and analysed using
an ABI 377 DNA Sequencer with Genescan™ fragment analysis
software. In lane (i), Chinese Spring ditelosomic line 2AS, a
300 base product is absent; in lane (ii), N2BT2A, a 204 base
product is absent; and in lane (iii), N2DT2B, a 191 base
product is absent. These results demonstrate that the
absence of specific wSBEII genes on each of the wheat
chromosomes can be detected by this assay. Lines lacking
wSBEII forms can be used as parental lines in breeding
programmes for generation of new lines in which expression
of SBE II is diminished or abolished, with consequent
increase in amylose content of the wheat grain. Thus a high
amylose wheat can be produced.
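
The scoring logic of this assay can be sketched as follows. The sketch assumes that the 300, 204 and 191 base products are diagnostic for the A, B and D genome copies respectively, as suggested by the lines in which each product is absent. The function name and the sizing tolerance are hypothetical.

```python
# Diagnostic product sizes (bases) for the intron 10 primer set.
# Assumption: each size marks the genome that is missing when it is absent.
DIAGNOSTIC_SIZES = {"A": 300, "B": 204, "D": 191}


def missing_genomes(observed_sizes, tolerance=1):
    """Return the genomes whose diagnostic product is not among the observed sizes.

    tolerance is a hypothetical sizing allowance in bases.
    """
    missing = []
    for genome, size in DIAGNOSTIC_SIZES.items():
        if not any(abs(obs - size) <= tolerance for obs in observed_sizes):
            missing.append(genome)
    return missing


# A lane showing only the 204 and 191 base products lacks the A genome product.
print(missing_genomes([204, 191]))
```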

Table 7 shows examples of primer pairs for SBE I,
SSS I and DBE I which can identify genes from individual
wheat genomes and could therefore be used to identify lines
containing null or altered alleles. Such tests could be used
to enable the development of wheat lines carrying null
mutations in each of the genomes for a specific gene (for
example SBE I, SSS I or DBE I) or combinations of null alleles
for different genes.
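
As an illustration of how such test results might be used when planning crosses, the following sketch finds the smallest sets of parental lines whose null alleles together cover the A, B and D genomes for one gene. The line names and null-allele assignments are invented for the example and do not come from this specification.

```python
from itertools import combinations

GENOMES = {"A", "B", "D"}


def parent_sets(nulls_by_line, max_parents=3):
    """Return the smallest combinations of lines whose nulls cover all three genomes."""
    for n in range(1, max_parents + 1):
        found = [
            combo
            for combo in combinations(sorted(nulls_by_line), n)
            if set().union(*(nulls_by_line[line] for line in combo)) >= GENOMES
        ]
        if found:
            return found
    return []


# Hypothetical screening results: genomes in which each line carries a null allele.
screen = {
    "line1": {"A"},
    "line2": {"B"},
    "line3": {"D"},
    "line4": {"A", "B"},
}
print(parent_sets(screen))
```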

It will be apparent to the person skilled in the
art that while the invention has been described in some
detail for the purposes of clarity and understanding,
various modifications and alterations to the embodiments and
methods described herein may be made without departing from
the scope of the inventive concept disclosed in this
specification.

References cited herein are listed on the following
pages, and are incorporated herein by this reference.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT:

(A) NAME: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL
RESEARCH ORGANISATION
(B) STREET: Limestone Avenue
(C) CITY: Campbell
(D) STATE: ACT
(E) COUNTRY: AUSTRALIA
(F) POSTAL CODE (ZIP): 2612

(A) NAME: THE AUSTRALIAN NATIONAL UNIVERSITY
(B) STREET: BRIAN LEWIS CRESCENT
(C) CITY: ACTON
(D) STATE: ACT
(E) COUNTRY: AUSTRALIA
(F) POSTAL CODE (ZIP): 2601

(A) NAME: GOODMAN FIELDER LIMITED
(B) STREET: LEVEL 42, GROSVENOR PLACE
(C) CITY: SYDNEY
(D) STATE: NSW
(E) COUNTRY: AUSTRALIA
(F) POSTAL CODE (ZIP): 2000

(A) NAME: GROUPE LIMAGRAIN PACIFIC PTY LIMITED
(B) STREET: LEVEL 31, 1 O'CONNELL STREET
(C) CITY: SYDNEY
(D) STATE: NSW
(E) COUNTRY: AUSTRALIA
(F) POSTAL CODE (ZIP): 2000

(ii) TITLE OF INVENTION: REGULATION OF GENE EXPRESSION IN PLANTS

(iii) NUMBER OF SEQUENCES: 17

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PCR primer based on the N-terminal sequence of wSBE I, 5' end at
position 168 of SEQ ID NO:5"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GGCACGCGAG AGACTGG 17

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PCR primer in which 5' end is at position 1590 of SEQ ID NO:5"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TACATTTCCT TGTCCATCA 19

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PCR primer 5' end is at position 1 of SEQ ID NO:5"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATCACGAGAG CTTGCTCA 18

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "PCR primer 5' end is at position 334 of SEQ ID NO:5"

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(v) FRAGMENT TYPE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CGGTACACAG TTGCGTCATT TTC 23
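
As a hedged illustration of how a primer in this listing relates to its template, the sketch below searches a template for a primer on either strand. The helper names are hypothetical, and the template excerpt is bases 301 to 360 of SEQ ID NO:5 as printed in this listing. The sketch reports where the annealing site starts on the listed strand and does not attempt to reproduce the position convention used in the primer descriptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_primer(template, primer, offset=1):
    """Return (strand, start) of the primer site on the template, or None.

    start is counted on the listed strand; offset is the coordinate of the
    first base of the template excerpt.
    """
    template = template.replace(" ", "").upper()
    primer = primer.replace(" ", "").upper()
    index = template.find(primer)
    if index >= 0:
        return ("+", index + offset)
    index = template.find(reverse_complement(primer))
    if index >= 0:
        return ("-", index + offset)
    return None


# Bases 301-360 of SEQ ID NO:5 and the primer of SEQ ID NO:4.
excerpt = "CTCTAAAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA"
primer4 = "CGGTACACAG TTGCGTCATT TTC"
print(find_primer(excerpt, primer4, offset=301))
```

A "-" strand result indicates that the primer matches the complement of the listed strand, as would be expected for a reverse primer.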

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2687 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATCGACGAAG ATGCTCTGCC TCACCGCCCC CTCCTGCTCG CCATCTCTCC CGCCGCGCCC 60
CTCCCGTCCC GCTGCTGACC GGCCCGGACC GGGGATTTCG GCCAAGAGCA AGTTCTCTGT 120
TCCCGTGTCT GCGCCAAGAG ACTACACCAT GGCAACAGCT GAAGATGGTG TTGGCGACCT 180
TCCGATATAC GATCTGGATC CGAAGTTTGC CGGCTTCAAG GAACACTTCA GTTATAGGAT 240
GAAAAAGTAC CTTGACCAGA AACATTCGAT TGAGAAGCAC GAGGGAGGCC TTGAAGAGTT 300
CTCTAAAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA 360
ATGGGCCCCT GCAGCAATGG ATGCACAACT TATTGGTGAC TTCAACAACT GGAATGGCTC 420
TGGGCACAGG ATGACAAAGG ATAATTATGG TGTTTGGTCA ATCAGGATTT CCCATGTCAA 480
TGGGAAACCT GCCATCCCCC ATAATTCCAA GGTTAAATTT CGATTTCACC GTGGAGATGG 540
ACTATGGGTC GATCGGGTTC CTGCATGGAT TCGTTATGCA ACTTTTGACG CCTCTAAATT 600
TGGAGCTCCA TATGACGGTG TTCACTGGGA TCCACCTTCT GGTGAAAGGT ATGTGTTTAA 660
GCATCCTCGG CCTCGAAAGC CTGACGCTCC ACGTATTTAC GAGGCTCATG TGGGGATGAG 720
TGGTGAGAGG CCTGAAGTAA GCACATACAG AGAATTTGCA GACAATGTGT TACCGCGCAT 780
AAAGGCAAAC AACTACAACA CAGTTCAGCT GATGGCAATC ATGGAACATT CCATATTATG 840
CTTCTTTTGG TACCATGTGA CGAATTTCTT CGCAGTTAGC AGCAGATCAG GAACACCAGA 900
GGACCTCAAA TATCTTGTTG ACAAGGCACA TAGCTTAGGG TTGCGTGTTC TGATGGATGT 960
TGTCCATAGC CATGCGAGCA GTAATATGAC AGATGGTCTA AATGGCTATG ATGTTGGACA 1020
AAACACACAG GAGTCCTATT TCCATACAGG AGAAAGGGGT TATCATAAAC TGTGGGATAG 1080
TCGCCTGTTC AACTATGCCA ATTGGGAGGT CTTACGGTAT CTTCTTTCTA ATCTGAGATA 1140
TTGGATGGAC GAATTCATGT TTGACGGCTT CCGATTTGAT GGAGTAACAT CCATGCTATA 1200
TAATCACCAT GGTATCAATA TGTCATTCGC TGGAAATTAC AAGGAATATT TTGGTTTGGA 1260
TACCGATGTA GATGCAGTTG TTTACATGAT GCTTGCGAAC CATTTAATGC ACAAAATCTT 1320
GCCAGAAGCA ACTGTTGTTG CAGAAGATGT TTCAGGCATG CCAGTGCTTT GTCGGTCAGT 1380
TGATGAAGGT GGAGTAGGGT TTGACTATCG CCTTGCTATG GCTATTCCTG ATAGATGGAT 1440
TGACTACTTG AAGAACAAAG ATGACCTTGA ATGGTCAATG AGTGCAATAG CACATACTCT 1500
GACCAACAGG AGATATACGG AAAAGTGCAT TGCATATGCT GAGAGCCACG ATCAGTCTAT 1560
TGTTGGCGAC AAGACTATGG CATTTCTCTT GATGGACAAG GAAATGTATA CTGGCATGTC 1620
AGACTTGCAG CCTGCTTCAC CTACAATTGA TCGTGGAATT GCACTTCAAA AGATGATTCA 1680
CTTCATCACC ATGGCCCTTG GAGGTGATGG CTACTTGAAT TTTATGGGTA ATGAGTTTGG 1740
CCACCCAGAA TGGATTGACT TTCCAAGAGA AGGCAACAAC TGGAGTTATG ATAAATGCAG 1800
ACGCCAGTGG AGCCTCTCAG ACATTGATCA CCTACGATAC AAGTACATGA ACGCATTTGA 1860
TCAAGCAATG AATGCGCTCG ACGACAAGTT TTCCTTCCTA TCGTCATCAA AGCAGATTGT 1920
CAGCGACATG AATGAGGAAA AGAAGATTAT TGTATTTGAA CGTGGAGATC TGGTCTTCGT 1980
CTTCAATTTT CATCCCAGTA AAACTTATGA TGGTTACAAA GTCGGATGTG ATTTGCCTGG 2040
GAAGTACAAG GTAGCTCTGG ACTCCGATGC TCTGATGTTT GGTGGACATG GAAGAGTGGC 2100
CCAGTACAAC GATCACTTCA CGTCACCTGA AGGAGTACCA GGAGTACCTG AAACAAACTT 2160
CAACAACCGC CCTAATTCAT TCAAAGTCCT GTCTCCACCC CGCACTTGTG TGGCTTACTA 2220
TCGCGTCGAG GAAAAAGCGG AAAAGCCTAA GGATGAAGGA GCTGCTTCTT GGGGCAAAGC 2280
TGCTCCTGGG TACATCGATG TTGAAGCCAC TCGTGTCAAA GACGCAGCAG ATGGTGAGGC 2340
GACTTCTGGT TCCAAAAAGG CGTCTACAGG AGGTGACTCC AGCAAGAAGG GAATTAACTT 2400
TGTCTTCGGG TCACCTGACA AAGATAACAA ATAAGCACCA TATCAACGCT TGATCAGAAC 2460
CGTGTACCGA CGTCCTTGTA ATATTCCTGC TATTGCTAGT AGTAGCAATA CTGTCAAACT 2520
GTGCAGACTT GAGATTCTGG CTTGGACTTT GCTGAGGTTA CCTACTATAT AGAAAGATAA 2580
ATAAGAGGTG ATGGTGCGGG TCGAGTCCGG CTATATGTGC CAAATATGCG CCATCCCGAG 2640
TCCTCTGTCA TAAAGGAAGT TTCGGGCTTT CAGCCCAGAA TAAAAAA 2687




(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 807 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..807
(D) OTHER INFORMATION: /label= sbel
/note= "deduced amino acid sequence from SEQ ID NO:5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Leu Cys Leu Thr Ala Pro Ser Cys Ser Pro Ser Leu Pro Pro Arg
1               5                   10                  15

Pro Ser Arg Pro Ala Ala Asp Arg Pro Gly Pro Gly Ile Ser Ala Lys
            20                  25                  30

Ser Lys Phe Ser Val Pro Val Ser Ala Pro Arg Asp Tyr Thr Met Ala
        35                  40                  45

Thr Ala Glu Asp Gly Val Gly Asp Leu Pro Ile Tyr Asp Leu Asp Pro
    50                  55                  60

Lys Phe Ala Gly Phe Lys Glu His Phe Ser Tyr Arg Met Lys Lys Tyr
65                  70                  75                  80

Leu Asp Gln Lys His Ser Ile Glu Lys His Glu Gly Gly Leu Glu Glu
                85                  90                  95

Phe Ser Lys Gly Tyr Leu Lys Phe Gly Ile Asn Thr Glu Asn Asp Ala
            100                 105                 110

Thr Val Tyr Arg Glu Trp Ala Pro Ala Ala Met Asp Ala Gln Leu Ile
        115                 120                 125

Gly Asp Phe Asn Asn Trp Asn Gly Ser Gly His Arg Met Thr Lys Asp
    130                 135                 140

Asn Tyr Gly Val Trp Ser Ile Arg Ile Ser His Val Asn Gly Lys Pro
145                 150                 155                 160

Ala Ile Pro His Asn Ser Lys Val Lys Phe Arg Phe His Arg Gly Asp
                165                 170                 175

Gly Leu Trp Val Asp Arg Val Pro Ala Trp Ile Arg Tyr Ala Thr Phe
            180                 185                 190

Asp Ala Ser Lys Phe Gly Ala Pro Tyr Asp Gly Val His Trp Asp Pro
        195                 200                 205

Pro Ser Gly Glu Arg Tyr Val Phe Lys His Pro Arg Pro Arg Lys Pro
    210                 215                 220

Asp Ala Pro Arg Ile Tyr Glu Ala His Val Gly Met Ser Gly Glu Arg
225                 230                 235                 240

Pro Glu Val Ser Thr Tyr Arg Glu Phe Ala Asp Asn Val Leu Pro Arg
                245                 250                 255

Ile Lys Ala Asn Asn Tyr Asn Thr Val Gln Leu Met Ala Ile Met Glu
            260                 265                 270

His Ser Ile Leu Cys Phe Phe Trp Tyr His Val Thr Asn Phe Phe Ala
        275                 280                 285

Val Ser Ser Arg Ser Gly Thr Pro Glu Asp Leu Lys Tyr Leu Val Asp
    290                 295                 300

Lys Ala His Ser Leu Gly Leu Arg Val Leu Met Asp Val Val His Ser
305                 310                 315                 320

His Ala Ser Ser Asn Met Thr Asp Gly Leu Asn Gly Tyr Asp Val Gly
                325                 330                 335

Gln Asn Thr Gln Glu Ser Tyr Phe His Thr Gly Glu Arg Gly Tyr His
            340                 345                 350

Lys Leu Trp Asp Ser Arg Leu Phe Asn Tyr Ala Asn Trp Glu Val Leu
        355                 360                 365

Arg Tyr Leu Leu Ser Asn Leu Arg Tyr Trp Met Asp Glu Phe Met Phe
    370                 375                 380

Asp Gly Phe Arg Phe Asp Gly Val Thr Ser Met Leu Tyr Asn His His
385                 390                 395                 400

Gly Ile Asn Met Ser Phe Ala Gly Asn Tyr Lys Glu Tyr Phe Gly Leu
                405                 410                 415

Asp Thr Asp Val Asp Ala Val Val Tyr Met Met Leu Ala Asn His Leu
            420                 425                 430

Met His Lys Ile Leu Pro Glu Ala Thr Val Val Ala Glu Asp Val Ser
        435                 440                 445

Gly Met Pro Val Leu Cys Arg Ser Val Asp Glu Gly Gly Val Gly Phe
    450                 455                 460

Asp Tyr Arg Leu Ala Met Ala Ile Pro Asp Arg Trp Ile Asp Tyr Leu
465                 470                 475                 480

Lys Asn Lys Asp Asp Leu Glu Trp Ser Met Ser Ala Ile Ala His Thr
                485                 490                 495

Leu Thr Asn Arg Arg Tyr Thr Glu Lys Cys Ile Ala Tyr Ala Glu Ser
            500                 505                 510

His Asp Gln Ser Ile Val Gly Asp Lys Thr Met Ala Phe Leu Leu Met
        515                 520                 525

Asp Lys Glu Met Tyr Thr Gly Met Ser Asp Leu Gln Pro Ala Ser Pro
    530                 535                 540

Thr Ile Asp Arg Gly Ile Ala Leu Gln Lys Met Ile His Phe Ile Thr
545                 550                 555                 560

Met Ala Leu Gly Gly Asp Gly Tyr Leu Asn Phe Met Gly Asn Glu Phe
                565                 570                 575

Gly His Pro Glu Trp Ile Asp Phe Pro Arg Glu Gly Asn Asn Trp Ser
            580                 585                 590

Tyr Asp Lys Cys Arg Arg Gln Trp Ser Leu Ser Asp Ile Asp His Leu
        595                 600                 605

Arg Tyr Lys Tyr Met Asn Ala Phe Asp Gln Ala Met Asn Ala Leu Asp
    610                 615                 620

Asp Lys Phe Ser Phe Leu Ser Ser Ser Lys Gln Ile Val Ser Asp Met
625                 630                 635                 640

Asn Glu Glu Lys Lys Ile Ile Val Phe Glu Arg Gly Asp Leu Val Phe
                645                 650                 655

Val Phe Asn Phe His Pro Ser Lys Thr Tyr Asp Gly Tyr Lys Val Gly
            660                 665                 670

Cys Asp Leu Pro Gly Lys Tyr Lys Val Ala Leu Asp Ser Asp Ala Leu
        675                 680                 685

Met Phe Gly Gly His Gly Arg Val Ala Gln Tyr Asn Asp His Phe Thr
    690                 695                 700

Ser Pro Glu Gly Val Pro Gly Val Pro Glu Thr Asn Phe Asn Asn Arg
705                 710                 715                 720

Pro Asn Ser Phe Lys Val Leu Ser Pro Pro Arg Thr Cys Val Ala Tyr
                725                 730                 735

Tyr Arg Val Glu Glu Lys Ala Glu Lys Pro Lys Asp Glu Gly Ala Ala
            740                 745                 750

Ser Trp Gly Lys Ala Ala Pro Gly Tyr Ile Asp Val Glu Ala Thr Arg
        755                 760                 765

Val Lys Asp Ala Ala Asp Gly Glu Ala Thr Ser Gly Ser Lys Lys Ala
    770                 775                 780

Ser Thr Gly Gly Asp Ser Ser Lys Lys Gly Ile Asn Phe Val Phe Gly
785                 790                 795                 800

Ser Pro Asp Lys Asp Asn Lys
                805
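
The relationship between SEQ ID NO:5 and the deduced amino acid sequence of SEQ ID NO:6 can be checked with a minimal translation sketch. It assumes the standard genetic code and that translation starts at the first ATG. The function name is hypothetical, and only the first 60 bases of SEQ ID NO:5 are used.

```python
# Standard genetic code, built from the conventional TCAG ordering (assumption:
# the standard code applies to this nuclear-encoded wheat gene).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_from_first_atg(dna):
    """Translate from the first ATG until a stop codon or the end of the input."""
    dna = dna.replace(" ", "").upper()
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First 60 bases of SEQ ID NO:5 as printed in this listing.
seq5_start = "ATCGACGAAG ATGCTCTGCC TCACCGCCCC CTCCTGCTCG CCATCTCTCC CGCCGCGCCC"
print(translate_from_first_atg(seq5_start))
```

In one-letter code the result corresponds to the first sixteen residues of SEQ ID NO:6 (Met Leu Cys Leu Thr Ala Pro Ser Cys Ser Pro Ser Leu Pro Pro Arg).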

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 319 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: misc_signal
(B) LOCATION: 1..319
(D) OTHER INFORMATION: /function= "3' untranslated region
of wSBE I-D4 cDNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GCGACTTCTG GTTCCAAAAA GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC 60
TTTGTCTTCG GGTCACCTGA CAAAGATAAC AAATAAGCAC CATATCAACG CTTGATCAGA 120
ACCGTGTACC GACGTCCTTG TAATATTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA 180
CTGTGCAGAC TTGAGATTCT GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT 240
AAATAAGAGG TGATGGTGCG GGTCGAGTCC GGCTATATGT GCCAAATATG CGCCATCCCG 300
AGTCCTCTGT CATAAAGGA 319

40 

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4890 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: promoter
(B) LOCATION: 1..4890
(D) OTHER INFORMATION: /function= "promoter containing
sequence of SBE I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGGTGGCGGG TCGGGCGGCA AGGCGCGGGG CGGCGGGGCG GCCGGGGCGG CGCGGCGGCG 60
CGGGCGGCAG CGGCGGCTAG GGTTTCGCGG CGGCGGCGAC TTGGGCTGAG GCGGGGCACG 120
GGCTGCGGCT TTAAAGGCCG GCCAGGCTGA GGTGTCCGGG TCGGACACGG CCCGTAAGGC 180
GGTTGACTTT AAAAAATAAT AATTCGGACA TGCAAAAAAG TAAGAAAAGA AATAATAAAC 240
GGACTCCAAA AATCCCGAAG TAAATTTTTC CCCATTCTTA AAAATAAGCC GGACAAGATG 300
AACATTTATT TGGGCCTAAA ATGCAATTTT GAAAAATGCG TATTTTTCCT AATTCGGAAT 360
AAAATCAAAT AAAATCCAAA TAAAATCAAA TATTTGTTTT TAATATTTTT CCTCCAATAT 420
TTCATTATTT GTGAAGAAGT CATTTTATCC CATCTCATAT ATTTTGATAT GAAATATTTT 480
CGGAGAGAAA AATAATTAAA ACAAATGATC CTATTTTCAA AATTTGAGAA AACCCAAATA 540
TGAAAATAAC GAAATCCCCA ACTCTCTCCG TGGGTCCTTG AGTTGCGTGA AATTTCTAGG 600
ATCACAAATC AAAATGCAAT AAAATATGAT ATGCATGATG ATCTAATGTA TAACATTCCA 660
ATTGAAAATT TGGGATGTTA CATATAACTC AAATTCTATA ATTATGAACA CAGAAATATT 720
AATGTAGAAC TCTATTTTGT TTTGAAATTG TATTATTTTT TAGAATTAGT CTAGAGCATT 780
TCGTGAACTT GAATCAAACC TTTAAATAAA ACAAAGCATA AAAATGACAA ATTCACATAT 840
GAAATAACTT GTGTTACATA GATTTATTAC AATAGCGTTG TATGTGTGTA TGTGTGCGTG 900
AGTGCCTATG GTAATATCAA TAAATATCTT GATAGATGTT TCTACAATTC ACGGGTCTAA 960
CTAGTAATGC AATGCAATGC ATGCTAAAAG AATAGAACCT TAGTTTCATT TAACTAACAA 1020
TTTTCAAATG TATGAGTTGC CAACAAGTGG CATACTTGGC ACTGTTTGTT TGTTCATTTT 1080
ATGGAAAGTT CTTCTCTTTT TACATGGTTT AGATTCCAGC ATGTAGCCAC AAAATATGAT 1140
TGTCAAAAGA TAATACCTCA TAATACAATT CCACTAAAGT CACCTAGCCC AAGTGACCGA 1200
CCTGATCCTG AAATAAAATC AGAAGATTTG GTGTCATCAT CATGACAACA AATTATTAGG 1260
CGGTAGATCT TGTGGTAGTA CTCATGATGT AAAATTATCA AGAGGGAGAG AATGTATGGA 1320
GATTTATGTG AAGTACATCG TACACCAGAC ATAGTTGACA CATCGATTTT TTAAGATACA 1380
TTTGGACGCG CCTTGTGGGA GTGTAAAGTA CTACCATGTA TTAGAAGAGG TGAAATGAGA 1440
AATGCCATAG CTAGCAAGTA GGCCTAGTTA AGGAAATTCT TCCTTAGATC CCCTTCTCCC 1500
GAAGAGTGAA GTGCTTCAAC TAAAGGTTAG ACCCACTTAA AAAATGTCAC TTTGAATCTT 1560
TGCTTCCCTT GTCGTAATCC TGTGCATTTG TAGGTCCCTC GGATCTGAGC CCTTTCTCCA 1620
AGCCCTTCAT TGGATTCCCC TGGATGTCTT TTTGTTACAT TTTATTGAAG TGAGAGTGAA 1680
TTATTATATG CCCATAGGAG GTGGGATATA AAGGCTGTTG GTATTCTGCA CCATACATGC 1740

TAGAGTAGGG AGGAGAGGCT GGTGCATGAT ACATGGTGGA CTAGCCCATA TATTTACCCC 1800
TCCCCCACCC ACTAACAAGT TTTTTTTATT AGGTCTTCAT CCTCTGATTT GTTTTTCTGT 1860
TAGCCCATTC TTCATCATGG ACTTATTAAT CATGATTAGT TTCTTGGATT TTTGTTTACT 1920
TGACTTGAAT TTGACAATGT GCCTCATATA TGGCATGTGG GACTGATAGG AAGATATATT 1980
CTCACAACAT TAACTTAAAA AGGATTATTT TTTTGGTGCA GTCGTAAAGA AAACTACTTT 2040
CTTTTATGCT AAAAGTTATT CAAACATAGA TTTATAAACA AAGGATATCA CCATGCATGA 2100
CCATGCGCTC TCTCATGTTT ACTCTAGAAA CCATATATCT CTTTGTTGCA AAATATTTAA 2160
TCTATCCTCC TTGTTTCTGG GAATGAGTCG GGGAAGGTAA TCTTAGGGAA GGTTAAAGTG 2220
AGGCAAGTAA GAGCAACTCT AGCAGAGTCG CGATATGCCC AATCGCCATA ATGCCAATAT 2280
GGCATTTTTG GCCCAAAATG GCACTTCAGA AGAGTCACCA TATCCCTTCG GATAGCCATA 2340
ATTTAGGGAG CTCGCTCCAC AAACAAGGTT CGAGCCTCCA AATATGGAGG CCATGGATTC 2400
GTTGTTTGGC ACTCACTCCA TATCCAACCG CAAGCGCATG CATGAGGGAA GTTTTAGCTT 2460
CTTCCTCCTT GCGCCAACGC CGGGATTTTA CACAGCGCAT TACAGGTACA TGAACCAGCA 2520
TGCACAGATA ATCACCGACG AGTGGGGTGA CAAGAAGGAT AAGCACCCTC CCATTAGTGG 2580
TGCGCCCACT CCCCTCAAAT TCATGAGGCA GCCATTTGGA TGGTCATCGC GTGGCATAAG 2640
CTCCGACTAT AAAATCTCAA CGGCATCACC AAAACCATAG CTGCCGCCTC CCCCTTCCTC 2700
GGCATCACCT CCCCAAGACA TCTCCTCCCC TCTATGCCAC AATGTCATCA TTATGGAGAG 2760
ACACAACTAC TGGTAAACCG CATACCCAAT CATGGTTTAC CGGCAGTGCG AACCCCACCT 2820
TCCTCCCACG ATGGTAGGAT ATTCTCCTCC TAGAATGGCG CGTGTGGCGC TTCCTCCTCC 2880
CGAGGCTGAT ATGTCGGCTC CCATGATGGC GTGCATCATT GATTTGGCGC TTCGGGTCCA 2940
TCATACATGT TAACGAGGTC ATCCCCATTG ATGTCGTTGG TCCCCTTGCC CCCCAGTCGG 3000
ATCCTGAGGA CCCGTTCGAT GTCGCAATGC GACTCTCCAA ACTCAAAGCT CACAATGAGG 3060
AGTACGTCCT CTAGGAGTTC CGCCCCGCAA CCATCTATAA GGAGGAGCAA CGATAGCTCT 3120
CCCCTACGCC TTCCTCGACG ATCTCTCTTA GGAGGACAAC GGCTAGACGA CGGCGGCGGC 3180
GGCGAAGGTA CTGCAGGTAG TAGAACATAG CAATGTCGAA TGGCGACATT GCATATTTTG 3240
AAAATGTCGC TCAACGACTT TTGAAGTCGC AAATAAAATG TAGTGTGACT ACTTTTGGCC 3300
AGCAATATAA GTTTATCACA TTTGATAATG ATTTGAACCG GTGTGGTTCA ACTAAATGTA 3360
CCATAAATTG AACATACAAA TTTTTAGCAA ATGAAAAAAG AAACAAGTAA GACCACAAAT 3420
ATGAAAGCCG CATATCGCGA CTATGTGTTT GAGCCGCAGC TGCCAAGTAC ATATGAAGCG 3480
TACTCCATAT GACATACGAC AACCATACAT ATGAAGACTC TACTAGAGTT CTCTAAGGCC 3540
GCTTTTAGCG CCTTTCGTGC AGTGGTGCCC ATAGGGAGTG AGGGTAGTTG GACTGTTCGT 3600
TTCCCCTTTT TTCATTTCTT TGAAATCTAT TTTATTTTTT TTCTCTTTTG TAGGTTTCCC 3660
AAATTTATAT ACCATTTTTC TGTTTCTCGC TATTTTTTGT TGTTATATTC TAGTTTCATA 3720
TTTTTCTATT ATTAATTTGT GTCTCTTATG AGAAGTCCAG ACTTGCATAT GGAGGTGCAC 3780
ACACAAACAT ATAAAGTATA AATACTAACT TGAGAAGTAT GTTTGCGTGG TCAAAAAAAC 3840
ATCATCAAAA CCTGCCAATA TGAGATATAG TTTTGAATAT ATCAATATGA GCAACGCAAC 3900
CATTTAAAAT GTGAACAATT GTTTTTTTAG AAAAAATATA AGAAATAACT CCAACCCAGC 3960
CAAACCACAT GCTATACACT TGCTCCATAT GAAACCATGT TTGCTATTGG GCAGTTGCCT 4020
GAAACCGAAA GTAATGTTAG CCGTTTTTCT ATTCAAAGAA GAAGGAGAGT CGAGGTGACG 4080
CGATGCTTAG ACGTGAGATG GGGATGACCA CAACGTCCCT ACAGAGACCT CACCGGAGAT 4140
GGGGACATTG CAGTTGACAC GAGAGCGGTG AGGGGCTGCG ATGCGTGTGC GGCAACATGT 4200
GGCGAGGCGG ACGTCGGGCT GGCAGGTAGG GGGGAGGGGG AAGGACCGGG GGAGGAAGAA 4260
GAGGAGTAGC CTGCAAAACA TGGTACACCA GTTTTCTGCC CTACGAAAAC CTCATTTCAT 4320
TCCCCCACCC TGACAAGCAA CAACCAACCA TCGCAGTCCC ACATGTCCCT CTGGTCTTTG 4380
CAAAAAGTAA TTGTTCTTGC TGGACAGCGC AAAGAGTAAA CTTTTGTTAG TTTTCATTTC 4440
TAGAAAAAGC AATCCTTTTA TAGTTCTTTT GTGAAAGTAA TGCTTTTATA GTGATTGGGA 4500
TGTTCTTTTA GAGCAAATAT CTTCTTTTTT TTTTAGGGAA AAGAGCAAAT ATCTTCCACT 4560
TTTCACAAAA CTGACGAAGG CTGAAAGTGG CGAGACAGTG AGGGCCCATA GCTTTCGTCC 4620
GGCCCAGCGG CGCACGACCG TCCACGTGCA CCCCGGCCCT CCCGGGCCCG CAGATCCGTT 4680
CTCCCTCGCC CCCGTTTCCC CCTCCCTCCC TCTCGTTGCT TCCACTCCAC TGTTCTCCTC 4740
TTCCTGTCCA AAGCGGCCAC GGACCGGAAA AAAATCACGC CTTTCCGTTG GGTCTCCGGC 4800
GCCACACTCC TCCTCCGGCC GATATAAAGC GCGCGGGGCC ACGGGCCCGG CGCAAAATGG 4860
GATTCCCGTC CGCCGCCATG GAGGAAGATG 4890

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6228 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION:
(D) OTHER INFORMATION: /product= "coding region of wSBE I-D4 gene"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ACGGGCCCGG CGCAAAATGG GATTCCCGTC CGCCGCCATC GACGAAGATG CTCTGCCTCA 60
CCGCCCCCTC CTGCTCGCCA TCTCTCCCGC CGCGCCCCTC CCGTCCCGCT GCTGACCGGC 120
CCGGACCGGG GATCTCGGTG AGTCAGTCGG GATCTTCATT TCTTTTCTTT TCTTTCGTTT 180
CCGGCTCCGT TCTGCCGGGG TTTCCCTGAT GCGATGCCGC GCGCGCGCAG GGCGGCGGCA 240
ATGTGCGGCT GAGCGCGGTG CCCGCGCCCT CTTCGCTCCG CTGGTCGTGG CCGCGGAAGG 300
TGAGCCCTCT CCCCTGTCTA CCCAGATTTG CGACCGTGAT CCCCTGTTGT CGCCGGGCAA 360
ACGGAATCTG ATCCACGGTG GTTATTGGAA ATAGTATATA CTACTAATAA ACTTGAGGCT 420
GGGATTCGTC CACTGAGGAA CAAGTGGATG CGATTTCGAT TGGATTTCTC TGCTTTATGC 480
GATCCGTACG CAGAATATCC CTCCTGCAGT GTCTCAACCG TATTACTGGA TGTACAACCC 540
AAATGTGTAT AATCTGTGCT GAATGTATCA ACCAATAATT GCTGCATTGT GAAAACATAA 600
TCCTGTGTTG TGTCTCTACT ACTTGTTCAG TCCTGATCTG CCGCTTATCC TAACTTTTGT 660
TCATTTATGG AAGGCCAAGA GCAAGTTCTC TGTTCCCGTG TCTGCGCCAA GAGACTACAC 720
CATGGCAACA GCTGAAGATG GTGTTGGCGA CCTTCCGATA TACGATCTGG ATCCGAAGTT 780
TGCCGGCTTC AAGGAACACT TCAGTTATAG GATGAAAAAG TACCTTGACC AGAAACATTC 840
GATTGAGAAG CACGAGGGAG GCCTTGAAGA GTTCTCTAAA GGTTAGCTTT TGTTTCATGT 900
GTTTGAAACA ATAGTTACAT CTTGTGGCGT CCGCAGCACA AAAGACATAA TGCGACTCTG 960
TTTTGTAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA 1020
ATGGGCCCCT GCAGCAATGT AAGTTCTAGT GTTGTCACGC AACTAATTGC AATGGTCGTT 1080
GGTTAACTTA TGAAGTGCTG ATGAAACTGT CTTAAGAGTT TATGGCTTGT CTTTTCTGAT 1140
TCTAGCTAGT AAAGAGTAGA TAAATATGAA ATATGTTTTC CCTTTTCTAG TTATGGTCAT 1200
GGTTGGCTGG TATTCATTTC TTTTATGGCA ATACTTGCTT CTAACTATCT TTAGTAGATT 1260
CATGTATTTA CTTGTGAGTC ATTACTTTAT GGGTGTAGGG ATGCACAACT TATTGGTGAC 1320
TTCAACAACT GGAATGGCTC TGGGCACAGG ATGACAAAGG ATAATTATGG TGTTTGGTCA 1380
ATCAGGATTT CCCATGTCAA TGGGAAACCT GCCATCCCCC ATAATTCCAA GGTTAAATTT 1440
CGATTTCACC GTGGAGATGG ACTATGGGTC GATCGGGTTC CTGCATGGAT TCGTTATGCA 1500
ACTTTTGATG CCTCTAAATT TGGAGCTCCA TATGACGGTG TTCACTGGGA TCCACCTTCT 1560
GGTGAAAGGT CTACTTTTAG TGGCTCGAGA GCAAGAAATC TAAGTAAAAC CCACACAATT 1620
AACTTACATT AATGTGGAGA CATGATACTT TTATTGCTCG TTTTGCAGGT ATGTGTTTAA 1680
GCATCCTCGG CCTCGAAAGC CTGACGCTCC ACGTATTTAC GAGGCTCATG TGGGGATGAG 1740
TGGTGAAAAG CCTGAAGTAA GCACATACAG AGAATTTGCA GACAATGTGT TACCGCGCAT 1800
AAAGGCAAAC AACTACAACA CAGTTCAGCT GATGGCAATC ATGGAACATT CATATTATGC 1860
TTCTTTTGGG TACCATGTGA CGAATTTCTT CGCAGTTAGC AGCAGATCAG AACGCCAGAG 1920
ACCTCAATAT CTTGTTGACA AGGCACATAG TTTACGGTTG CGTGTTCTGA TGGATGTTGT 1980
CCATAGCCAT GCGAGCAGTA ATAAGACAGA TGGTCTTAAT GGCTATGATG TTGGGCAAAA 2040
CACACAGGAG TCCTATTTCC ACACAGGAGA AAGGGGCTAT CATAAACTGT GGGATAGCCG 2100
CCTGTTCAAC TATGCCAATT GGGAGTCTTA CGATTTCTTC TTTCTAATCT GAGATATTGG 2160
ATGGACGAAT TCATGTTTGA TGGCTTCCGA TTTGATGGGG TAACATCCAT GCTATATAAT 2220
CACCATGGTA TCAATATGTC ATTCGCTGGA AGTTACAAGG AATATTTTGG TTTGGATACT 2280
GATGTAGATG CAGTTGTTTA CCTGATGCTT GCGAACCATT TAATGCACAA ACTCTTGCCA 2340
GAAGCAACTG TTGTTGCAGA AGATGTTTCA GGCATGCCAG TGCTTTGTCG GTCAGTTGAT 2400
GAAGGTGGAG TAGGGTTTGA CTATCGCCTG GCTATGGCTA TTCCTGATAG ATGGATCGAC 2460
TACTTGAAGA ACAAAGATGA CCTTGAATGG TCAATGAGTG GAATAGCACA TACTCTGACC 2520
AACAGGAGAT ATACGGAAAA GTGCATTGCA TATGCTGAGA GCCATGATCA GGTATGTTTT 2580
CCCTCCTTTG TCGCTGTGCG TGAGTATGTG TTCTTTTTTT ATGGGGCACT GGTCTAAGAA 2640
CATACAGTTC AAAGGTGAGA CACTTTCTTT GCCTGGTAGA CAAATTTGAG AAATAAACAT 2700
TTCGCTTGAT GACTTTTAGT TGCTTCACAA GTTCGAATTA AGTTAGTTAT ATTCTGATAA 2760
CTAGTGATAG TACCCACTAA CCAGCTATTA CGGACCATGT AAGAATGTCC GAAGACTGCA 2820
GTTATATATC GTTGACTTTG TGTTCATCTA TTGAAACAAC TTAGTAGTTA ACTTTCACGC 2880
AAATTTTCAG TCTATTGTTG GCGACAAGAC TATGGCATTT CTCTTGATGG ACAAGGAAAT 2940
GTATACTGGC ATGTCAGACT TGCAGCCTGC TTCGCCTACA ATTGATCGTG GAATTGCACT 3000
TCAAAAGGTT CGATTCGTTT TAAGTATTCC TGAATTTGAT GTTCTAGTTC CAGACGAGTA 3060
TTGTAATGTT CGTTGTTACT CAGAGTTCTG CTTAGTCCTT GAAGATAATG TATTCCAGTC 3120
CCTTTTGGTA CATTTGGCTT ATTTTGTTAC AAATATTTCA GATGATTCAC TTCATCACCA 3180
TGGCCCTTGG AGGTGATGGC TACTTGAATT TTATGGGTAA TGAGGTAATA TCTGGTTATC 3240
TGTCAAAACT TATTTCTGAT CAATATGTTT CGGGATTCCC TCGAAAAAAA TCCTTTGGGC 3300
AGGGCGAAAA GTTTAAACAT CTGTTTTCTA TGATAGCCAA GTACTCCCCA GCTATTTCCA 3360
TGTTATCACG TATCATTTAG CTGTGCCGGT AGTTAATCTT TATTCTAATT CATTGTTGTT 3420
TTTTAGCGTG GCAGTCTATT GTTGGATCCT CTTATTCCAA TTACATATAT GCCGACATCA 3480
CACACTTATG AATATTCCCT GTTTAAAAGA TTTTTATTTT ATACCAATGT TTCTCCGTAA 3540
ATGATGCAAA CATGATAGAG ATGTTAGCAT GTCTTTCTTA ACCTACTCAT GTTTTACATA 3600
TCACGACAAG CTTCTTGCAG AAAATCAGCA GTATATGGCA AATTGCTGCA ACCTGACAAC 3660
GTTTATATCT GTTTTCTAAC TCATACTGAC GGTGCAATTT CCTTTTAGTT TGGCCACCCA 3720
GAATGGATTG ACTTTCCAGA AGAAGGCAAC AACTGGAGTT ATGATAAATG CAGACGCCAG 3780
TGGAGCCTCG CAGACATTGA TCACCTACGA TACAAGGTTA TGCCTATGTA TATTTTTACA 3840
GTTTCTGGTC TGGTAGCTCT CTTGGGATCT TGACCTCACT TAGTTCCTTC ATCTCTGACT 3900
GTAGCTTATT TACACTGTGT TCCAACTTCT GTCTTGTGGA TAAATTCTCC CTTCTAACGT 3960
TTCATATTAA GCCTTTCAAA CTAAACTAAA TTGCTGATCT ACTACTAGTT GCTCAGTACG 4020
ATGACCAAAT CTTGCCTGTG GTAACCTAGT AATTTTCTTG ATTCTTACAC ATTAGTGATA 4080
TGCAGTGCAT ACATTATCCA TATAAATTGA CATTGCAATT TCCCAAATAT TATTTGAAGG 4140
CTGTGTTCTT TTGTTAACAG GAAGTTATTT TCTCTGCATC TGATAAATAA TAATAGCCTT 4200
TCACGATTTT TCTCATATTT TATCCAACTT TTCTGCATTC AAGCATTTTT TGTTTCTCGC 4260
CTAACATATA TAATTTGAAC AGTACATGAA CGCATTTGAT CAAGCAATGA ATGCGCTCGA 4320
CGACAAATTT TCCTTCCTAT CATCATCAAA GCAGATTGTC AGCGACATGA ATGAGGAAAA 4380
GAAGTAGTTA ACTATACAAT GTTTAGTCAG GGCAGCTGTT GCATCATTTG ATTCACTCCT 4440
ACTCTTAAGA ATAGCAACTC TGACTTGTGC GTTTTATGTT ACCAAATAAG TTGAAACCGT 4500
ATCTGTTTGA TATGAACCAT TGTTGTCTCA AAATGGGCTA TGGACTCAAT CCAACTTCCT 4560
TTCCAGATTA TTGTATTTGA ACGTGGAATC TGGTCTTCGT CTTCAATTTT CATCCCAGTA 4620
AAACTTATGA TGGGTAACTG ATCTCTTGCA AGCTTTGCCT TTCAATATTT CTTCTGCTTA 4680
ATGACTAATG TGCTTAATCT CGTTTCCACT TTTAAAACAC GCAGTTACAA AGTCGGATGT 4740
GACTTGCCTG GGAAGTACAA GGTAGCTCTG GACTCTGATG CTCTGATGTT TGGTGGACAT 4800
GGAAGAGTAA GCAATGTTAA TGATGTTCAA GATCTGTTTT GCAACACTAT GTTCTTCTAT 4860
AGAAGGGGCC ATCAAGGCTG CATCAGATAA TCTTATTTGC AGTGTTGATC TGTGCTGCAT 4920
CGCAGGTGGC CCATGACAAC GATCACTTTA CGTCACCTGA AGGAGTACCA GGAGTACCTG 4980
AAACAAACTT CAACAACCGC CCTAACTCAT TCAAAATCCT GTCTCCATCC CGCACTTGTG 5040
TGGTAATGCT AATTACTAGG AGGATTTAGT AACAATAAAT AAATAACAGC AAAAGATATC 5100
TGCAGTACGA TCTCACAAAA TGCTCTCTTG CCAGGCTTAC TATCGCGTCG AGGAGAAAGC 5160
GGAAAAGCCC AAGGATGAAG GAGCTGCTTT CTTGGGGGAA ACTGCTCTCG GGTACATCGA 5220
TGTTGAAGCC ACTGGCGTCA AAGACGCAGC AGATGGTGAG GCGACTTCTG GTTCCGAAAA 5280
GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC TTTGTCTTTC TGTCACCCGA 5340
CAAAGACAAC AAATAAGCAC CATATCAACG CTTGATCAGG ACCGTGTGCC GACGTCCTTG 5400
TAATACTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA CTGTGCAGAC TTGAAATTCT 5460
GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT AAATAAGCGG TGATGGTGCG 5520
GGTCGAGTCC AGCTATATGT GCCAAATATG CGCCATCCCG AGTCCTCTGT CATAAAGAAA 5580
GTTTCGGGCT TCCATCCCAG AATAAAAACA GTTGTCTGTT TGCAATTTCT TTTTGTCTTG 5640
CATAGTTACA TGATAATTGA TGCATATTGC TATAAGCCTG GATTGCATCT TCTTTTGCTA 5700
ATAACTGCAG GGCCAAGAAA GCCTAGATTG TATCTTTTTT TGCTAATAAC TGCAGTGCTG 5760
GGGAAGCTTC AGTCCTTGTT TCCGTTCTCG AGACAAGGCG TCATGTTTGG CGCACAAAGG 5820
TAAGCCATCA TCTTATCAAG TCCCAAAATT CTCTGGTTGA AAGAAACCAT CACTAACTTG 5880
TTCCAGGTGT TGGTTCCTCC ACAACCAAAA GGCGACCATC GTCGTCATCA TCGCTCACAG 5940
CACTGACCAT CGAAGCCACG GTGGGCATGA AATGCGCATC GCCCAAGACT TGGGACCGTT 6000
TCAAAATATC ACAAACTGCC ATGGCATCTT CTGCCAAAGG CTGCACTGCA CCTTTGGCAT 6060
GAACAGAAGC AACAGGGGCT TGGAACTGAA CGCCGAAAAT AAAGTCAAAC CGGCTGGGCC 6120
GGATTGAAAG GGGAAACGCC AAAATCCACT TAATTTGAAT GGAAGGAGGA ATGGTTCTTG 6180
CTGGTTTCAA CTCTGCAGGC TTCCCTCTGA ATTTCACACG GAGCCATT 6228

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11463 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..11463
(D) OTHER INFORMATION: /product= "complete sequence of the
starch branching enzyme II gene"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

AGAAACACCT CCATTTTAGA TTTTTTTTTT GTTCTTTTCG GACGGTGGGT CGTGGAGAGA 6 0 

3 5 TTAGCGTCTA GTTTTCTTAA AAGAACAGGC CATTTAGGCC CTGCTTTACA AAAGGCTCAA 12 0 

CCAGTCCAAA ACGTCTGCTA GGATCACCAG CTGCAAAGTT AAGCGCGAGA CCACCAAAAC 180 

AGGCGCATTC GAACTGGACA GACGCTCACG CAGGAGCCCA GCACCACAGG CTTGAGCCTG 2 40 

40 

ACAGCGGACG TGAGTGCGTG ACACATGGGG TCATCTATGG GCGTCGGAGC AAGGAAGAGA 3 00 

GACGCACATG AACACCATGA TGATGCTATC AGGCCTGATG GAGGGAGCAA CCATGCACCT 3 60 

4 5 TTTCCCCTCT GG AAATTC AT AGCTCACACT TTTTTTTAAT GGAAGCAAGA GTTGGCAAAC 42 0 

ACATGCATTT TCAAACAAGG AAAATTAATT CTCAAACCAC CATGACATGC AATTCTCAAA 480 

CCATGCACCG ACGAGTCCAT GCGAGGTGGA AACGAAGAAC TGAAAATCAA CATCCCAGTT 540 

50 

GTCGAGTCGA GAAGAGGATG ACACTGAAAG TATGCGTATT ACG ATTTC AT TTACATACAT 600 

GTACAAATAC ATAATGTACC C T AC AATTTG TTTTTTGGAG CAGAGTGGTG TGGTCTTTTT 660 

55 TTTTTACACG AAAATGCCAT AGCTGGCCCG CATGCGTGCA GATCGGATGA TCGGTCGGAG 7 20 

ACGACGGACA ATCAGACACT CACCAACTGC TTTTGTCTGG GACACAATAA ATGTTTTTGT 7 80 

AAACAAAATA AATACTTATA AACGAGGGTA CTAGAGGCCG CTAACGGCAT GGCCAGGTAA 840 

60 

ACGCGCTCCC AGCCGTTGGT TTGCGATCTC GTCCTCCCGC ACGCAGCGTC GCCTCCACCG 900
TCCGTCCGTC GCTGCCACCT CTGCTGTGCG CGCGCACGAA GGGAGGAAGA ACGAACGCCG 960
CACACACACT CACACACGGC ACACTCCCCG TGGGTCCCCT TTCCGGCTTG GCGTCTATCT 1020
CCTCTCCCCC GCCCATCCCC ATGCACTGCA CCGTACCCGC CAGCTTCCAC CCCCGCCGCA 1080
CACGTTGCTC CCCCTTCTCA TCGCTTCTCA ATTAATATCT CCATCACTCG GGTTCCGCGC 1140
TGCATTTCGG CCGGCGGGTT GAGTGAGATC TGGGCGACTG GCTGACTCAA TCACTACGCG 1200
GGGATGGCGA CGTTCGCGGT GTCCGGCGCG ACTCTCGGTG TGGCGCGGGC CGGCGTCGGA 1260
GTGGCGCGGG CCGGCTCGGA GCGGAGGGGC GGGGCGGACT TGCCGTCGCT GCTCCTCAGG 1320
AAGAAGGACT CCTCTCGTAC GCCTCGCTCT CTCGAATCTC CCCCGTCTGG CTTTGGCTCC 1380
CCTTCTCTCT CCTCTGCGCG CGCATGGCCT GTTCGATGCT GTTCCCCAAT TGATCTCCAT 1440
GAGTGAGAGA GATAGCTGGA TTAGGCGATC GCGCTTCCTG AACCTGTATT TTTTCCCCCG 1500
CGGGGAAATG CGTTAGTGTC ACCCAGGCCC TGGTGTTACC ACGGCTTTGA TCATTCCTCG 1560
TTTCATTCTG ATATATATTT TCTCATTCTT TTTCTTCCTG TTCTTGCTGT AACTGCAAGT 1620
TGTGGCGTTT TTTCACTATT GTAGTCATCC TTGCATTTTG CAGGCGCCGT CCTGAGCCGC 1680
GCGGCCTCTC CAGGGAAGGT CCTGGTGCCT GACGGCGAGA GGACGACTTG GCAAGTCCGG 1740
CGCAACCTGA AGAATTACAG GTACACACAC TCGTGCCGGT AAATCTTCAT ACAATCGTTA 1800
TTCACTTACC AAATGCCGGA TGAAACCAAC CACGGATGCG TCAGGTTTCG AGCTTCTTCT 1860
ATCAGCATTG TGCAGTACTG CACTGCCTTG TTCATTTTGT TAGCCTTGGC CCCGTGCTGG 1920
CTCTTGGGCC ACTGAAAAAA TCAGATGGAT GTGCATTCTA GCAAGAACTT CACAACATAA 1980
TGCACCGTTT GGGGTTTCGT CAGTCTGCTC TACAATTGCT ATTTTTCGTG CTGTAGATAC 2040
CTGAAGATAT CGAGGAGCAA ACGGCGGAAG TGAACATGAC AGGGGGGACT GCAGAGAAAC 2100
TTCAATCTTC AGAACCGACT CAGGGCATTG TGGAAACAAT CACTGATGGT GTAACCAAAG 2160
GAGTTAAGGA ACTAGTCGTG GGGGAGAAAC CGCGAGTTGT CCCAAAACCA GGAGATGGGC 2220
AGAAAATATA CGAGATTGAC CCAACACTGA AAGATTTTCG GAGCCATCTT GACTACCGGT 2280
AATGCCTACC CGCTGCTTTC GCTCATTTTG AATTAAGGTC CTTTCATCAT GCAAATTTGG 2340
GGAACATCAA AGAGACAAAG ACTAGGGACC ACCATTTCAT ACAGATCCCT TCGTGGTCTG 2400
AGAATATGCT GGGAAGTAAA TGTATAATTG ATGGCTACAA TTTGCTCAAA ATTGCAATAC 2460
GAATAACTGT CTCCGATCAT TACAATTAAA GAGTGGCAAA CTGATGAAAA TGTGGTGGAT 2520
GGGTTATAGA TTTTACTTTG CTAATTCCTC TACCAAATTC CTAGGGGGGA AATCTACCAG 2580
TTGGGAAACT TAGTTTCTTA TCTTTGTGGC CTTTTTGTTT TGGGGAAAAC ACATTGCTAA 2640
ATTCGAATGA TTTTGGGTAT ACCTCGGTGG ATTCAACAGA TACAGCGAAT ACAAGAGAAT 2700
TCGTGCTGCT ATTGACCAAC ATGAAGGTGG ATTGGAAGCA TTTTCTCGTG GTTATGAAAA 2760
GCTTGGATTT ACCCGCAGGT AAATTTAAAG CTTTATTATT ATGAAACGCC TCCACTAGTC 2820
TAATTGCATA TCTTATAAGA AAATTTATAA TTCCTGTTTT CCCCTCTCTT TTTTCCAGTG 2880
CTGAAGGTAT CGTCTAATTG CATATCTTAT AAGAAAATTT ATATTCCTGT TTTCCCCTAT 2940
TTTCCAGTGC TGAAGGTATC ACTTACCGAG AATGGGCTCC CTGGAGCGCA TGTTATGTTC 3000
TTTTAAGTTC CTTAACGAGA CACCTTCCAA TTTATTGTTA ATGGTCACTA TTCACCAACT 3060
AGCTTACTGG ACTTACAAAT TAGCTTACTG AATACTGACC AGTTACTATA AATTTATGAT 3120
CTGGCTTTTG CACCCTGTTA CAGTCTGCAG CATTAGTAGG TGACTTCAAC AATTGGAATC 3180
CAAATGCAGA TACTATGACC AGAGTATGTC TACAGCTTGG CAATTTTCCA CCTTTGCTTC 3240
ATAACTACTG ATACATCTAT TTGTATTTAT TTAGCTGTTT GCACATTCCT TAAAGTTGAG 3300
CCTCAACTAC ATCATATCAA AATGGTATAA TTTGTCAGTG TCTTAAGCTT CAGCCCAAAG 3360
ATTCTACTGA ATTTAGTCCA TCTTTTTGAG ATTGAAAATG AGTATATTAA GGATGAATGA 3420
ATACGTGCAA CACTCCCATC TGCATTATGT GTGCTTTTCC ATCTACAATG AGCATATTTC 3480
CATGCTATCA GTGAAGGTTT GCTCCTATTG ATGCAGATAT TTGATATGGT CTTTTCAGGA 3540
TGATTATGGT GTTTGGGAGA TTTTCCTCCC TAACAACGCT GATGGATCCT CAGCTATTCC 3600
TCATGGCTCA CGTGTAAAGG TAAGCTGGCC AATTATTTAG TCGAGGATGT AGCATTTTCG 3660
AACTCTGCCT ACTAAGGGTC CCTTTTCCTC TCTGTTTTTT AGATACGGAT GGATACTCCA 3720
TCCGGTGTGA AGGATTCAAT TTCTGCTTGG ATCAAGTTCT CTGTGCAGGC TCCAGGTGAA 3780
ATACCTTTCA ATGGCATATA TTATGATCCA CCTGAAGAGG TAAGTATCGA TCTACATTAC 3840
ATTATTAAAT GAAATTTCCA GTGTTACAGT TTTTTAATAC CCACTTCTTA CTGACATGTG 3900
AGTCAAGACA ATACTTTTGA ATTTGGAAGT GACATATGCA TTAATTCACC TTCTAAGGGC 3960
TAAGGGGCAA CCAACCTTGG TGATGTGTGT ATGCTTGTGT GTGACATAAG ATCTTATAGC 4020
TCTTTTATGT GTTCTCTGTT GGTTAGGATA TTCCATTTTG GCCTTTTGTG ACCATTTACT 4080
AAGGATATTT ACATGCAAAT GCAGGAGAAG TATGTCTTCC AACATCTCAA CTAAACGACC 4140
AGAGTCACTA AGGATTTATG AATCACACAT TGGAATGAGC AGCCCGGTAT GTCAATAAGT 4200
TATTTCACCT GTTTCTGGTC TGATGGTTTA TTCTATGGAT TTTCTAGTTC TGTTATGTAC 4260
TGTTAACATA TTACATGGTG CATTCACTTG ACAACCTCGA TTTTATTTTC TAATGTCTTC 4320
ATATTGGCAA GTGCAAAACT TTGCTTCCTC TTTGTCTGCT TGTTCTTTTG TCTTCTGTAA 4380
GATTTCCATT GCATTTGGAG GCAGTGGGCA TGTGAAAGTC ATATCTATTT TTTTTTTGTC 4440
AGAGCATAGT TATATGAATT CCATTGTTGT TGCAATAGCT CGGTATAATG TAACCATGTT 4500
ACTAGCTTAA GATTTCCCAC TTAGGATGTA AGAAATATTG CATTGGAGCG TCTCCAGCAA 4560
GCCATTTCCT ACCTTATTAA TGAGAGAGAG ACAAGGGGGG GGGGGGGGGG GGGGTTCCCT 4620
TCATTATTCT GCGAGCGATT CAAAAACTTC CATTGTTCTG AGGTGTACGT ACTGCAGGGA 4680
TCTCCCATTA TGAAGAGGAT ATAGTTAATT CTTTGTAACC TACTTGGAAA CTTGAGTCTT 4740
GAGGCATCGC TAATATATAC TATCATCACA ATACTTAGAG GATGCATCTG AAATTTTAGT 4800
GTGATCTTGC ACAGGAACCG AAGATAAATT CATATGCTAA TTTTAGGGAT GAGGTGTTGC 4860
CAAGAATTAA AAGGCTTGGA TACAATGCAG TGCAGATAAT GGCAATCCAG GAGCATTCAT 4920
ACTATGCAAG CTTTGGGTAT TCACACAATC CATTTTTTTC TGTATACACT CTTCACCCAT 4980
TTGGAGCTAT TACATCCTAA TGCTTCATGC ACATAAAATA TTTGGATATA ATCCTTTATT 5040
AGATATATAG TACAACTACA CTTAGTATTC TGAAAAAGAT CATTTTATTG TTGTTGGCTT 5100
GTTCCAGGTA CCATGTTACT AATTTTTTTG CACCAAGTAG CCGTTTTGGA ACTCCAGAGG 5160
ACTTAAAATC CTTGATCGAT AGAGCACATG AGCTTGGTTT GCTTGTTCTT ATGGATATTG 5220
TTCATAGGTA ATTAGTCCAA TTTAATTTTA GCTGTTTTAC TGTTTATCTG GTATTCTAAA 5280
GGGAAATTCA GGCAATTATG ATACATTGTC AAAAGCTAAG AGTGGCGAAA GTGAAATGTC 5340
AAAATCTAGA GTGGCATAAG GAAAATTGGC AAAAACTAGA GTGGCAAAAA TAAAATTTTC 5400
CCATCCTAAA TGGCAGGGCC CTATCGCCGA ATATTTTTCC ATTCTATATA ATTGTGCTAC 5460
GTGACTTCTT TTTTCTCAGA TGTATTAAAC CAGTTGGACA TGAAATGTAT TTGGTACATG 5520
TAGTAAACTG ACAGTTCCAT AGAATATCGT TTTGTAATGG CAACACAATT TGATGCCATA 5580
GATGTGGATT GAGAAGTTCA GATGCTATCA ATAGAATTAA TCAACTGGCC ATGTACTCGT 5640
GGCACTACAT ATAGTTTGCA AGTTGGAAAA CTGACAGCAA TACCTCACTG ATAAGTGGCC 5700
AGGCCCCACT TGCCAGCTTC ATACTAGATG TTACTTCCCT GTTGAATTCA TTTGAACATA 5760
TTACTTAAAG TTCTTCATTT GTCCTAAGTC AAACTTCTTT AAGTTTGACC AAGTCTATTG 5820
GAAAATATAT CAACATCTAC AACACCAAAT TACTTTGATC AGATTAACAA TTTTTATTTT 5880
ATTATATTAG CACATCTTTG ATGTTGTAGA TATCAGCACA TTTTTCTATA GACTTGGTCA 5940
AATATAGAGA AGTTTGACTT AGGACAAATC TAGAACTTCA ATCAATTTGG ATCAGAGGGA 6000
ACATCAAATA ATATAGATAG ATGTCAACAC TTCAACAAAA AAATCAGACC TTGTCACCAT 6060
ATATGCATCA GACCATCTGT TTGCTTTAGC CACTTGCTTT CATATTTATG TGTTTGTACC 6120
TAATCTACTT TTCCTTCTAC TTGGTTTGGT TGATTCTATT TCAGTTGCAT TGCTTCATCA 6180
ATGATTTTGT GTACCCTGCA GTCATTCGTC AAATAATACC CTTGACGGTT TGAATGGTTT 6240
CGATGGCACT GATACACATT ACTTCCACGG TGGTCCACGC GGCCATCATT GGATGTGGGA 6300
TTCTCGTCTA TTCAACTATG GGAGTTGGGA AGTATGTAGC TCTGACTTCT GTCACCATAT 6360
TTGGCTAACT GTTCCTGTTA ATCTGTTCTT ACACATGTTG ATATTCTATT CTTATGCAGG 6420
TATTGAGATT CTTACTGTCA AACGCGAGAT GGTGGCTTGA AGAATATAAG TTTGATGGAT 6480
TTCGATTTGA TGGGGTGACC TCCATGATGT ATACTCACCA TGGATTACAA GTAAGTCATC 6540
AAGTGGTTTC AGTAACTTTT TTAGGGCACT GAAACAATTG CTATGCATCA TAACATGTAT 6600
CATGATCAGG ACTTGTGCTA CGGAGTCTTA GATAGTTCCC TAGTATGCTT GTACAATTTT 6660
ACCTGATGAG ATCATGGAAG ATTGGAAGTG ATTATTATTT ATTTTCTTTC TAAGTTTGTT 6720
TCTTGTTCTA GATGACATTT ACTGGGAACT ATGGCGAATA TTTTGGATTT GCTACTGATG 6780
TTGATGCGGT AGTTTACTTG ATGCTGGTCA ACGATCTAAT TCATGGACTT TATCCTGATG 6840
CTGTATCCAT TGGTGAAGAT GTAAGTGCTT ACAGTATTTA TGATTTTTAA CTAGTTAAGT 6900
AGTTTTATTT TGGGGATCAG TCTGTTACAC TTTTTGTTAG GGGTAAAATC TCTCTTTTCA 6960
TAACAATGCT AATTTATACC TTGTATGATA ATGCATCACT TAGTAATTTG AAAAGTGCAA 7020
GGGCATTCAA GCTTACGAGC ATATTTTTTG ATGGCTGTAA TTTATTTGAT AGTATGCTTG 7080
TTTGGGTTTT TCAATAAGTG GGAGTGTGTG ACTAATGTTG TATTATTTAT TTAATTGCGG 7140
AAGAAATGGG CAACCTTGTC AATTGCTTCA GAAGGCTAAC TTTGATTCCA TAAACGCTTT 7200
GGAAATGAGA GGCTATTCCC AAGGACATGA ATTATACTTC AGTGTGTTCT GTACATGTAT 7260
TTGTAATAGT GGTTTAACTT AAATTCCTGC ACTGCTATGG AATCTCACTG TATGTTGTAG 7320
TGTACACATC CACAAACAAG TAATCCTGAG CTTTCAACTC ATGAGAAAAT AGAGTCCGCT 7380
TCTGCCAGCA TTAACTGTTC ACAGTTCTAA TTTGTGTAAC TGTGAAATTG TTCAGGTCAG 7440
TGGAATGCCT ACATTTTGCA TCCCTGTTCC AGATGGTGGT GTTGGTTTTG ACTACCGCCT 7500
GCATATGGCT GTAGCAGATA AATGGATTGA ACTCCTCAAG TAAGTGCAGG AATATTGGTG 7560
ATTACATGCG CACAATGATC TAGATTACAT TTTCTAAATG GTAAAAAGGA AAATATGTAT 7620
GTGAATATCT AGACATTTGC CTGTTATCAG CTTGAATACG AGAAGTCAAA TACATGATTT 7680
AAATAGCAAA TCTCGGAAAT GTAATGGCTA GTGTCTTTAT GCTGGGCAGT GTACATTGCG 7740
CTGTAGCAGG CCAGTCAACA CAGTTAGCAA TATTTTCAGA AACAATATTA TTTATATCCG 7800
TATATGAGAA AGTTAGTATA TAAACTGTGG TCATTAATTG TGTTCACCTT TTGTCCTGTT 7860
TAAGGATGGG CAGTAGGTAA TAAATTTAGC CAGATAAAAT AAATCGTTAT TAGGTTTACA 7920
AAAGGAATAT ACAGGGTCAT GTAGCATATC TAGTTGTAAT TAATGAAAAG GCTGACAAAA 7980
GGCTCGGTAA AAAAAACTTT ATGATGATCC AGATAGATAT GCAGGAACGC GACTAAAGCT 8040
CAAATACTTA TTGCTACTAC ACAGCTGCCA ATCTGTCATG ATCTGTGTTC TGCTTTGTGC 8100
TATTTAGATT TAAATACTAA CTCGATACAT TGGCAATAAT AAACTTAACT ATTCAACCAA 8160
TTTGGTGGAT ACCAGAATTT CTGCCCTCTT GTTAGTAATG ATGTGCTCCC TGCTGCTGTT 8220
CTCTGCCGTT ACAAAAGCTG TTTTCAGTTT TTTGCATCAT TATTTTTGTG TGTGAGTAGT 8280
TTAAGCATGT TTTTTGAAGC TGTGAGCTGT TGGTACTTAA TACATTCTTG GAAGTGTCCA 8340
AATATGCTGC AGTGTAATTT AGCATTTCTT TAACACAGGC AAAGTGACGA ATCTTGGAAA 8400
ATGGGCGATA TTGTGCACAC CCTAACAAAT AGAAGGTGGC TTGAGAAGTG TGTAACTTAT 8460
GCAGAAAGTC ATGATCAAGC ACTAGTTGGT GACAAGACTA TTGCATTCTG GTTGATGGAT 8520
AAGGTACTAG CTGTTACTTT TGGACAAAAG AATTACTCCC TCCCGTTCCT AAATATAAGT 8580
CTTTGTAGAG ATTCCACTAT GGACCACATA GTATATAGAT GCATTTTAGA GTGTAGATTC 8640
ACTCATTTTG CTTCGTATGT AGTCCATAGT GAAATCTCTA CAGAGACTTA TATTTAGGAA 8700
CGGAGGGAGT ACATAATTGA TTTGTCTCAT CAGATTGCTA GTGTTTTCTT GTGATAAAGA 8760
TTGGCTGCCT CACCCATCAC CAGCTATTTC CCAACTGTTA CTTGAGCAGA ATTTGCTGAA 8820
AACGTACCAT GTGGTACTGT GGCGGCTTGT GAACTTTGAC AGTTATGTTG CAATTTTCTG 8880
TTCTTATTTA TTTGATTGCT TATGTTACCG TTCATTTGCT CATTCCTTTC CGAGACCAGC 8940
CAAAGTCACG TGTTAGCTGT GTGATCTGTT ATCTGAATCT TGAGCAAATT TTATTAATAG 9000
GCTAAAATCC AACGAATTAT TTGCTTGAAT TTAAATATAC AGACGTATAG TCACCTGGCT 9060
CTTTCTTAGA TGATTACCAT AGTGCCTGAA GGCTGAAATA GTTTTGGTGT TTCTTGGATG 9120
CCGCCTAAAG GAGTGATTTT TATTGGATAG ATTCCTGGCC GAGTCTTCGT TACAACATAA 9180
CATTTTGGAG ATATGCTTAG TAACAGCTCT GGGAAGTTTG GTCACAAGTC TGCATCTACA 9240
CGCTCCTTGA GGTTTTATTA TGGCGCCATC TTTGTAACTA GTGGCACCTG TAAGGAAACA 9300
CATTCAAAAG GAAACGGTCA CATCATTCTA ATCAGGACCA CCATACTAAG AGCAAGATTC 9360
TGTTCCAATT TTATGAGTTT TTGGGACTCC AAAGGGAACA AAAGTGTCTC ATATTGTGCT 9420
TATAACTACA GTTGTTTTTA TACCAGTGTA GTTTTATTCC AGGACAGTTG ATACTTGGTA 9480
CTGTGCTGTA AATTATTTAT CCGACATAGA ACAGCATGAA CATATCAAGC TCTCTTTGTG 9540
CAGGATATGT ATGATTTCAT GGCTCTGGAT AGGCTTCAAC TCTTCGCATT GATCGTGGCA 9600
TAGCATTACA TAAAATGATC AGGCTTGTCA CCATGGGTTT AGGTGGTGAA GGCTATCTTA 9660
ACTTCATGGG AAATGAGTTT GGGCATCCTG GTCAGTCTTT ACAACATTAT TGCATTCTGC 9720
ATGATTGTGA TTTACTGTAA TTTGAACCAT GCTTTTCTTT CACATTGTAT GTATTATGTA 9780
ATCTGTTGCT TCCAAGGAGG AAGTTAACTT CTATTTACTT GGCAGAATGG ATAGATTTTC 9840
CAAGAGGCCC ACAAACTCTT CCAACCGGCA AAGTTCTCCC CTGGAAATAA CAATAGTTAT 9900
GATAAATGCC GCCGTAGATT TGATCTTGTA AGTTTTAGCT GTGCTATTAC ATTCCCTCAC 9960
TAGATCTTTA TTGGCCATTT ATTTCTTGAT GAAATCATAA TGTTTGTTAG GAAAGATCAA 10020
CATTGCTTTT GTAGTTTTGT AGACGTTAAC ATAAGTATGT GTTGAGAGTT GTTGATCATT 10080
AAAAATATCA TGATTTTTTG CAGGGAGATG CAGATTTTCT TAGATATCGT GGTATGCAAG 10140
AGTTCGATCA GGCAATGCAG CATCTTGAGG AAAAATATGG GGTATGTCAC TGGTTTGTCT 10200
TTGTTGCATA ACAAGTCACA GTTTAACGTC AGTCTCTTCA AGTGGTAAAA AAAGTGTAGA 10260
ATTAATTCCT GTAATGAGAT GAAAACTGTG CAAAGGCGGA GCTGGAATTG CTTTTCACCA 10320
AAACTATTTT CTTAAGTGCT TGTGTATTGA TACATATACC AGCACTGACA ATGTAACTGC 10380
AGTTTATGAC ATCTGAGCAC CAGTATGTTT CACGGAAACA TGAGGAAGAT AAGGTGATCA 10440
TCCTCAAAAG AGGAGATTTG GTATTTGTTT TCAACTTCCA CTGGAGCAAT AGCTTTTTTG 10500
ACTACCGTGT TGGGTGTTCC AAGCCTGGGA AGTACAAGGT ATGCTTGCCT TTTCATTGTC 10560
CACCCTTCAC CAGTAGGGTT AGTGGGGGCT TCTACAACTT TTAATTCCAC ATGGATAGAG 10620
TTTGTTGGTC GTGCAGCTAT CAATATAAAG AATAGGGTAA TTTGTAAAGA AAAGAATTTG 10680
CTCGAGCTGT TGTAGCCATA GGAAGGTTGT TCTTAACAGC CCCGAAGCAC ATACCATTCA 10740
TTCATATTAT CTACTTAAGT GTTTGTTTCA ATCTTTATGC TCAGTTGGAC TCGGTCTAAT 10800
ACTAGAACTA TTTTCCGAAT CTACCCTAAC CATCCTAGCA GTTTTAGAGC AGCCCCATTT 10860
GGACAATTGG CTGGGTTTTT GTTAGTTGTG ACAGTTTCTG CTATTTCTTA ATCAGGTGGC 10920
CTTGGACTCT GACGATGCAC TCTTTGGTGG ATTCAGCAGG CTTGATCATG ATGTCGACTA 10980
CTTCACAACC GTAAGTCTGG GCTCAAGCGT CACTTGACTC GTCTTGACTC AACTGCTTAC 11040
AAATCTGAAT CAACTTCCCA ATTGCTGATG CCCTTGCAGG AACATCCGCA TGACAACAGG 11100
CCGCGCTCTT TCTCGGTGTA CACTCCGAGC AGAACTGCGG TCGTGTATGC CCTTACAGAG 11160
TAAGAACCAG CAGCGGCTTG TTACAAGGCA AAGAGAGAAC TCCAGAGAGC TCGTGGATCG 11220
TGAGCGAAGC GACGGGCAAC GGCGCGAGGC TGCTCCAAGC GCCATGACTG GGAGGGGATC 11280
GTGCCTCTTC CCCAGATGCC AGGAGGAGCA GATGGATAGG TAGCTTGTTG GTGAGCGCTC 11340
GAAAGAAAAT GGACGGGCCT GGGTGTTTGT TGTGCTGCAC TGAACCCTCC TCCTATCTTG 11400
CACATTCCCG GTTGTTTTTG TACATATAAC TAATAATTGC CCGTGCGCTC AACGTGAAAA 11460
TCC 11463

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2662 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..2651
(D) OTHER INFORMATION: /product= "nucleotide sequence of cDNA wheat SSS I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

TCTCCCACTC TTCTCTCCCC GCGCACACCG AGTCGGCACC GGCTCATCAC CCATCACCTC 60
GGCCTCGGCC ACCGGCAAAC CCCCCGATCC GCTTTTGCAG GCAGCGCACT AAAACCCCGG 120
GGAGCGCGCC CCGCGGCAGC AGCAGCACCG CAGTGGGAGA GAGAGGCTTC GCCCCGGCCC 180
GCACCGAGCG GGGCGATCCA CCGTCCGTGC GTCCGCACCT CCTCCGCCTC CTCCCCTGTC 240
CCGCGCGCCC ACACCCATGG CGGCGACGGG CGTCGGCGCC GGGTGCCTCG CCCCCAGCGT 300
CCGCCTGCGC GCCGATCCGG CGACGGCGGC CCGGGCGTCC GCCTGCGTCG TCCGCGCGCG 360
GCTCCGGCGC TTGGCGCGGG GCCGCTACGT TGCCGAGCTC AGCAGGGAGG GCCCCGCGGC 420
GCGCCCCGCG CAGCAGCAGC AACTGGCCCC GCCGCTCGTG CCAGGCTTCC TCGCGCCGCC 480
GCCGCCCGCG CCCGCCCAGT CGCCGGCCCC GACGCAGCCG CCCCTGCCGG ACGCCGGCGT 540
GGGGGAACTC GCGCCCGACC TCCTGCTCGA AGGGATTGCT GAGGATTCCA TCGACAGCAT 600
AATTGTGGCT GCAAGTGAGC AGGATTCTGA GATCATGGAT GCGAATGAGC AACCTCAAGC 660
TAAAGTTACA CGTAGCATCG TGTTTGTGAC TGGTGAAGCT GCTCCTTATG CAAAGTCAGG 720
GGGGCTGGGA GATGTTTGTG GTTCGTTACC AATTGCTCTT GCTGCTCGTG GTCACCGTGT 780
GATGGTTGTA ATGCCAAGAT ACTTGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATT 840
ATACACTGGG AAGCACATTA AGATTCCATG CTTTGGGGGA TCACATGAAG TGACCTTTTT 900
TCATGAGTAT AGAGACAACG TCGATTGGGT GTTTGTCGAT CATCCGTCAT ATCATAGACC 960
AGGAAGTTTA TATGGAGATA ATTTTGGTGC TTTTGGTGAT AATCAGTTCA GATACACACT 1020
CCTTTGCTAT GCTGCATGCG AGGCCCCACT AATCCTTGAA TTGGGAGGAT ATATTTATGG 1080
ACAGAATTGC ATGTTTGTTG TGAACGATTG GCATGCCAGC CTTGTGCCAG TCCTTCTTGC 1140
TGCAAAATAT AGACCATACG GTGTTTACAG AGATTCCCGC AGCACCCTTG TTATACATAA 1200
TTTAGCACAT CAGGGTCTGG AGCCTGCAAG TACATATCCT GATCTGGGAT TGCCACCTGA 1260
ATGGTATGGA GCTTTAGAAT GGGTATTTCC AGAATGGGCA AGGAGGCATG CCCTTGACAA 1320
GGGTGAGGCA GTTAACTTTT TGAAAGGAGC AGTCGTGACA GCAGATCGAA TTGTGACCGT 1380
CAGTCAGGGT TATTCATGGG AGGTCACAAC TGCTGAAGGT GGACAGGGCC TCAATGAGCT 1440
CTTAAGCTCC CGAAAAAGTG TATTGAATGG AATTGTAAAT GGAATTGACA TTAATGATTG 1500
GAACCCCACC ACAGACAAGT GTCTCCCTCA TCATTATTCT GTCGATGACC TCTCTGGAAA 1560
GGCCAAATGT AAAGCTGAAT TGCAGAAGGA GCTGGGTTTA CCTGTAAGGG AGGATGTTCC 1620
TCTGATTGGC TTTATTGGAA GACTGGATTA CCAGAAAGGC ATTGATCTCA TTAAAATGGC 1680
CATTCCAGAG CTCATGAGGG AGGACGTGCA GTTTGTCATG CTTGGATCTG GGGATCCAAT 1740
TTTTGAAGGC TGGATGAGAT CTACCGAGTC GAGTTACAAG GATAAATTCC GTGGATGGGT 1800
TGGATTTAGT GTTCCAGTTT CCCACAGAAT AACTGCAGGT TGCGATATAT TGTTAATGCC 1860
ATCCAGGTTT GAACCTTGTG GTCTTAATCA GCTATATGCT ATGCAATATG GTACAGTTCC 1920
TGTAGTTCAT GGAACTGGGG GCCTCCGAGA CACAGTCGAG ACCTTCAACC CTTTTGGTGC 1980
AAAAGGAGAG GAGGGTACAG GGTGGGCGTT CTCACCGCTA ACCGTGGACA AGATGTTGTG 2040
GGCATTGCGA ACCGCGATGT CGACATTCAG GGAGCACAAG CCGTCCTGGG AGGGGCTCAT 2100
GAAGCGAGGC ATGACGAAAG ACCATACGTG GGACCATGCC GCCGAGCAGT ACGAGCAGAT 2160
CTTCGAATGG GCCTTCGTGG ACCAACCCTA CGTCATGTAG ACGGGGACTG GGGAGGTCGA 2220
AGCGCGGGTC TCCTTGAGCT CTGAAGACAT GTTCCTCATC CTTCCGCGGC CCGGAAGGAT 2280
ACCCCTGTAC ATTGCGTTGT CCTGCTACAG TAGAGTCGCA ATGCGCCTGC TTGCTTGGTC 2340
CGCCGGTTCG AGAGTAGATG ACGGCTGTGC TGCTGCGGCG GTGACAGCTT CGGGTGGATG 2400
ACAGTTACAG TTTTGGGGAA TAAGGAAGGG ATGTGCTGCA GGATGGTTAA CAGCAAAGCA 2460
CCACTCAGAT GGCAGCCTCT CTGTCCGTGT TACAGCTGAA ATCAGAAACC AACTGGTGAC 2520
TCTTTAGCCT TAGCGATTGT GAAGTTTGTT GCATTCTGTG TATGTTGTCT TGTCCTTAGC 2580
TGACAAATAT TAGACCTGTT GGAGAATTTT ATTTATCTTT GCTGCTGTTG TTTTTGTTTT 2640
GTTAAAAAAA AAAAAAAAAA AA 2662 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 768 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..768

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..768
(D) OTHER INFORMATION: /product= "deduced amino acid sequence SBE II"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Ala Thr Phe Ala Val Ser Gly Ala Thr Leu Gly Val Ala Arg Pro
1               5                   10                  15

Pro Ala Ala Ala Gln Pro Glu Glu Leu Gln Ile Pro Glu Asp Ile Glu
            20                  25                  30

Glu Gln Thr Ala Glu Val Asn Met Thr Gly Gly Thr Ala Glu Lys Leu
        35                  40                  45

Glu Ser Ser Glu Pro Thr Gln Gly Ile Val Glu Thr Ile Thr Asp Gly
    50                  55                  60

Val Thr Lys Gly Val Lys Glu Leu Val Val Gly Glu Lys Pro Arg Val
65                  70                  75                  80

Val Pro Lys Pro Gly Asp Gly Gln Lys Ile Tyr Glu Ile Asp Pro Thr
                85                  90                  95

Leu Lys Asp Phe Arg Ser His Leu Asp Tyr Arg Tyr Ser Glu Tyr Arg
            100                 105                 110

Arg Ile Arg Ala Ala Ile Asp Gln His Glu Gly Gly Leu Glu Ala Phe
        115                 120                 125

Ser Arg Gly Tyr Glu Lys Leu Gly Phe Thr Arg Ser Ala Glu Gly Ile
    130                 135                 140

Thr Tyr Arg Glu Trp Ala Pro Gly Ala His Ser Ala Ala Leu Val Gly
145                 150                 155                 160






PCT/AU98/00743 




Asp Phe Asn Asn Trp Asn Pro Asn Ala Asp Thr Met Thr Arg Asp Asp
                165                 170                 175

Tyr Gly Val Trp Glu Ile Phe Leu Pro Asn Asn Ala Asp Gly Ser Pro
            180                 185                 190

Ala Ile Pro His Gly Ser Arg Val Lys Ile Arg Met Asp Thr Pro Ser
        195                 200                 205

Gly Val Lys Asp Ser Ile Ser Ala Trp Ile Lys Phe Ser Val Gln Ala
    210                 215                 220

Pro Gly Glu Ile Pro Phe Asn Gly Ile Tyr Tyr Asp Pro Pro Glu Glu
225                 230                 235                 240

Glu Lys Tyr Val Phe Gln His Pro Gln Pro Lys Arg Pro Glu Ser Leu
                245                 250                 255

Arg Ile Tyr Glu Ser His Ile Gly Met Ser Ser Pro Glu Pro Lys Ile
            260                 265                 270

Asn Ser Tyr Ala Asn Phe Arg Asp Glu Val Leu Pro Arg Ile Lys Arg
        275                 280                 285

Leu Gly Tyr Asn Ala Val Gln Ile Met Ala Ile Gln Glu His Ser Tyr
    290                 295                 300

Tyr Ala Ser Phe Gly Tyr His Val Thr Asn Phe Phe Ala Pro Ser Ser
305                 310                 315                 320

Arg Phe Gly Thr Pro Glu Asp Leu Lys Ser Leu Ile Asp Arg Ala His
                325                 330                 335

Glu Leu Gly Leu Leu Val Leu Met Asp Ile Val His Ser His Ser Ser
            340                 345                 350

Asn Asn Thr Leu Asp Gly Leu Asn Gly Phe Asp Gly Thr Asp Thr His
        355                 360                 365

Tyr Phe His Gly Gly Pro Arg Gly His His Trp Met Trp Asp Ser Arg
    370                 375                 380

Leu Phe Asn Tyr Gly Ser Trp Glu Val Leu Arg Phe Leu Leu Ser Asn
385                 390                 395                 400

Ala Arg Trp Trp Leu Glu Glu Tyr Lys Phe Asp Gly Phe Arg Phe Asp
                405                 410                 415

Gly Val Thr Ser Met Met Tyr Thr His His Gly Leu Gln Met Thr Phe
            420                 425                 430

Thr Gly Asn Tyr Gly Glu Tyr Phe Gly Phe Ala Thr Asp Val Asp Ala
        435                 440                 445

Val Val Tyr Leu Met Leu Val Asn Asp Leu Ile His Gly Leu His Pro
    450                 455                 460

Asp Ala Val Ser Ile Gly Glu Asp Val Ser Gly Met Pro Thr Phe Cys
465                 470                 475                 480

Ile Pro Val Pro Asp Gly Gly Val Gly Phe Asp Tyr Arg Leu His Met
                485                 490                 495

Ala Val Ala Asp Lys Trp Ile Glu Leu Leu Lys Gln Ser Asp Glu Ser
            500                 505                 510

Trp Lys Met Gly Asp Ile Val His Thr Leu Thr Asn Arg Arg Trp Leu
        515                 520                 525

Glu Lys Cys Val Thr Tyr Ala Glu Ser His Asp Gln Ala Leu Val Gly
    530                 535                 540

Asp Lys Thr Ile Ala Phe Trp Leu Met Asp Lys Asp Met Tyr Asp Phe
545                 550                 555                 560

Met Ala Leu Asp Arg Pro Ser Thr Pro Arg Ile Asp Arg Gly Ile Ala
                565                 570                 575

Leu His Lys Met Ile Arg Leu Val Thr Met Gly Leu Gly Gly Glu Gly
            580                 585                 590

Tyr Leu Asn Phe Met Gly Asn Glu Phe Gly His Pro Glu Trp Ile Asp
        595                 600                 605

Phe Pro Arg Gly Pro Gln Thr Leu Pro Thr Gly Lys Val Leu Pro Gly
    610                 615                 620

Asn Asn Asn Ser Tyr Asp Lys Cys Arg Arg Arg Phe Asp Leu Gly Asp
625                 630                 635                 640

Ala Asp Phe Leu Arg Tyr His Gly Met Gln Glu Phe Asp Gln Ala Met
                645                 650                 655

Gln His Leu Glu Glu Lys Tyr Gly Phe Met Thr Ser Glu His Gln Tyr
            660                 665                 670

Val Ser Arg Lys His Glu Glu Asp Lys Val Ile Ile Phe Glu Arg Gly
        675                 680                 685

Asp Leu Val Phe Val Phe Asn Phe His Trp Ser Asn Ser Phe Phe Asp
    690                 695                 700

Tyr Arg Val Gly Cys Ser Arg Pro Gly Lys Tyr Lys Val Ala Leu Asp
705                 710                 715                 720

Ser Asp Asp Ala Leu Phe Gly Gly Phe Ser Arg Leu Asp His Asp Val
                725                 730                 735

Asp Tyr Phe Thr Thr Glu His Pro His Asp Asn Arg Pro Arg Ser Phe
            740                 745                 750

Ser Val Tyr Thr Pro Ser Arg Thr Ala Val Val Tyr Ala Leu Thr Glu
        755                 760                 765

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10550 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

WO 99/14314 






(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..316
(D) OTHER INFORMATION: /product= "exon 1"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1472..1828
(D) OTHER INFORMATION: /product= "exon 2"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2766..2823
(D) OTHER INFORMATION: /product= "exon 3"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2906..3028
(D) OTHER INFORMATION: /product= "exon 4"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4113..4194
(D) OTHER INFORMATION: /product= "exon 5"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4286..4459
(D) OTHER INFORMATION: /product= "exon 6"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4562..4643
(D) OTHER INFORMATION: /product= "exon 7"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4744..4855
(D) OTHER INFORMATION: /product= "exon 8"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4999..5021
(D) OTHER INFORMATION: /product= "exon 9"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 5102..5192
(D) OTHER INFORMATION: /product= "exon 10"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8593..8718
(D) OTHER INFORMATION: /product= "exon 11"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8807..8915
(D) OTHER INFORMATION: /product= "exon 12"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8992..9104
(D) OTHER INFORMATION: /product= "exon 13"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 9161..9199
(D) OTHER INFORMATION: /product= "exon 14"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 9498..9713
(D) OTHER INFORMATION: /product= "exon 15"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:




ATGGCGGCGA CGGGCGTCGG CGCCGGGTGC CTCGCCCCCA GCGTCCGCCT 50
GCGCGCCGAT CCGGCGACGG CGGCCCGGGC GTCCGCTTGC GTCGTCCGCG 100
CGCGGCTCCG GCGCTTGGCG CGGGGCCGCT ACGTCGCCGA GCTCAGCAGG 150
GAGGGCCCCG CGGCGCGCCC CGCGCAGCAG CAGCAACTGG CCCCGCCGCT 200
CGTGCCAGGC TTCCTCGCGC CGCCGCCGCC CGCGCCCGCC CAGTCGCCGG 250
CCCCGACGCA GCCGCCCCTG CCGGACGCCG GCGTGGGGGA ACTCGCGCCC 300
GACCTCCTGC TCGAAGGTAA AAAACAAGGC TGAATCCTCA GATCACTCCG 350
CGTCTTCGTT TTACCAAATA CGGTACTGCG AAGTGGTGCT GTATATGTGA 400
AGTTTCTGTC GATTTCTTCC TGACGGATGT TCAGTCGATT CAGTTGTATA 450
TATGTGATAC GTTCGTTGTT CATCGATCGT ACAGATTTAC CAGCACACTA 500
GATAGAAATC GAGACCGACG CGGGCAGATC AATAGAlTi'l TCTAGACGTT 550
TTATTGGATC GTGAGATGAT TGATTGGGGT GGCGTGTCGA TACGATAGCG 600
GTGCACCGCC GATGTATCGG GGCATGTGCA CGTGGTTGGG TCTCAGCAGA 650
CATATCACTA GACTGGTATC GTAATTTACT AGTACTACTG GAAAGAGGAC 700
TAAAAAGGCT AGGCCAAGTG CACGCATGTT GGGAACGTTG TTAAATTGAT 750
GAGTTTGTCC TTTGCTTGGG CTGGTATTAT TACCAAAAAA TGGTGTTAGT 800
CCCTGTACTT ATTAATGGGA AAATCTTAAC ATGACACTGG GGTTTATGAG 850
TCTCCAATTG TATATTCTCA GCACTCAACT GAT1TI ACTG ATACTGTAGT 900
GGAAATGACA CGTGAGCACC CCCCTTCAAG GAATGCAATG CTTCTTTCTG 950
TTTTATATTA CAGGAACTAG AAGGAGCTTC CACCTTTGAG TACAGAAGTA 1000
CTCCCTCCGT TCCAAAATAG ATGACTCAAC TTTGTACTAA 1TLTGTACTA 1050
TAGTTAGTAC AAAGTTGAGT CATCTA Tl'lT AGAACGGAGG GAGTAGTATC 1100
GAAATTGAAG ACCCTTGTAT TACTGTCTTG TTTTTCAATG AAAATGGGAG 1150
GCCCATGCAG TAAGTCACAT GGGCACCTGG GAGGCTGGGA TCATGTGTGC 1200
TTTGCAGAGT ACTAGACCCA GCTCACCCTC TGTTAGATTA CTTGTTGGGC 1250
TGCTACTTTG TGTTTGCTGT GCAGTATATC AGACATCCTG AATTTGGCAT 1300
CTAGCTGAGA ACAGAATGCA GGTTGCACCA TTCTTATTAT TGCTAAACTG 1350
TTGTCACGCA ATTTATAAAG AATGTGATCT TCTGAGTATT AATTAATCAT 1400
GTTCTGCTAA TATCTGTCCT CGCTCTGGTG TTGACAAATA TACCATATGA 1450
ATATTTTCCA TTTTGCAACC AGGGATTGCT GAGGATTCCA TCGACAGCAT 1500
AATCGTGGCT GCAAGTGAGC AGGATTCTGA GATCATGGAT GCGAATGAGC 1550
AACCTCAAGC TAAAGTTACA CGTAGCATCG TGTTTGTGAC TGGTGAAGCT 1600
GCTCCTTATG CAAAGTCAGG GGGGCTGGGA GATGTTTGTG GTTCGTTACC 1650
AATTGCTCTT GCTGCTCGTG GTCACCGTGT GATGGTTGTA ATGCCAAGAT 1700
ACTTGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATT ATACACTGCG 1750
AAGCACATTA AGATTCCATG CTTTGGGGGA TCACATGAAG TGACCTTTTT 1800
TCATGAGTAT AGAGACAACG TCGATTGGGT GGGTACACAA TCACCTTCTT 1850
ATTCTCTGTT GAATTGTAGC AACTGTTTAT CCTTGTTTAC ACTTCTTTTA 1900
GCCCTGCAAA GACATATGTG ATTTCCATAC TTTTTTGTTA TTTCCCTTGT 1950
ACTCTTGCTC ATGAAGGTCA AAATATCATA TATCCATGGA AGTCATGCAT 2000
GTGCCTAGTA TTTTTGGTGT CGGTGCCTTT AACTTTCAGG GATTAATACG 2050
TGGAATTTGA TAACTAAAGT TT AT ITT ATT GAAAAAAATT GTAGGTTGG 2100
TGAGCCCACA GCCACGCAGT GGCACCACTG CTTGCACATG ATITTGCATT 2150
TCTGTTTGCA CCGAGCACTT CATGTGAATA AGGTGTAAAA TCATAAAGTA 2200
CCAATTTTAT TCTGCCAATT GCACTTAAGA GTATATACAT TTATCTTGGC 2250
CTCAATCATG GGAGTACTGT GCATTCAGTG CACCATCATT GTTCTAAGGA 2300
GAAAATGTGG GTGCAAGGAA GACACTTTTG TCCCTTAATA AAAGGCAGGC 2350
ACTCTGTTGT CATATAGATA GAAAGCAACA AACTTATTTC AAAGAGCTAA 2400
CAATGGCAAA AGAACCAAAA AAAGCATGCT AAGGCGGTGA CACCAAAAGG 2450
TGAGGGGGGC CTTGTGACTG ACAGCACCCC AAACTATTGC CATTGTTTTA 2500
CTAAATGAAG ATCATTTTAG AAGCTCTCAG GAACTTCGAA AACAGTGGCT 2550
TTCCGTCCAC AGATCGTCTG TTAATATTTT TGTCCAGTGA TACTTTTTTT 2600
GCTCCTTACA AGAGTGCCTA TGTTGACATA TACATTGTTA AGTTGTTCAT 2650
AAGTTTACTT CTTATTCTAA ACAGCAAGTG CCTAATGCTT GCATTTATTT 2700
TGGCTATTTA TTTTTATTCT CATTTCAATC AACACTTTTG TTCAGGTGTT 2750
TGTCGATCAT CCGTCATATC ATAGACCAGG AAGTTTATAT GGAGATAATT 2800
TTGGTGCTTT TGGTGATAAT CAGGTACACT ACACTATACT AAGCTCCTAG 2850
TTGACTAAGT CGTAAGTTGT ACCTCCTCGC TGACCGGCTG CTCTATGTCG 2900
TGCAGTTCAG ATACACACTC CTTTGCTATG CTGCATGCGA GGCCCCACTA 2950
ATCCTTGAAT TGGGAGGATA TATTTATGGA CAGAATTGCA TGTTTGTTGT 3000
GAACGATTGG CATGCCAGCC TTGTGCCAGT GTACGTTGTT TGTGGATCTG 3050
AAAGTCCAAT CCTTTATTCA TTCTCTGCTT TGCAGTGTGC CCATGTCTAC 3100
ATTTCTTTTA TGCTTTTTTC ATGTCTGTTC TTATATTGCA TATATGCTTA 3150
TGGAGTCTAA AAGTTACCGG AGGGAATAAC TCTTAAGGAT TTCCTCAATC 3200
AATTATCTTT AGCTTTAGTT AACATTTACT GTGGCAAACA TAATGTGTTT 3250
TGAGATTTAC AAGTTCAGAG ATTGCACTTC ACTAGTTCGT AGCTAATCTG 3300
ATGTTTTCCC CGAGAAAATG CCTAAAGCTT TGTGTCTTGA TGCATTGATA 3350
GAAAAAGAGT TTATGTACAC TCCCAAAGAG GGGACCCAAA ATTACAACAC 3400
CACACCCCTG AGAACTAGGC GCTGCCGGAA GAAGCGATGC AAGCCCCACT 3450
GCCCCTGCCT TAGCTCAAAG CCGGGCGTCA GCTTGATTGT GTCAAGTAAG 3500
CTAGCAGTGC TAGATTGCGC AAGGTCGATT CGTCGAAGAT GACAGTGTTG 3550
CGCTGCTTCC AAATCCACCA AACTATGAGC ATGATCACTG GAGAAGTACC 3600
TTTTCTCGCG GCTGAGGGGG TGGACTGGTG GTCTGCTGCT GCCAGTTTTC 3650
AGATAATCTG AAAAATGCAT GTTTTGATGA TTTTAGTATC TTGCGGACCC 3700
TGGGTACCAC CTAAGCTTTC ACACAGTAAT TTGCAGTTAC ACCTATAAAA 3750
GTAACGGTCA TGATATGCAT GTGTTTTGGG TAGATCATGG TGCATGCATT 3800
TTAGGAATTA GGACATGCCA GAACCACGTG AGGCTTATGG GGCAATTCAT 3850
TTGTTCCATT ATACGAGTCA TGAATATGGT TCAGCATGTT TGGACGCTAC 3900
TTGTTTGGGG CAATTTCAGA TGGTGAATTG TAGCTGCTTG ATGTTGGCTA 3950
GCTGGCTTAT TTTGTACAAG TATCGATGTT AGATGCATAT TTCCTTTTGT 4000
TCTTGTGCTG TTTGCCATGT TGTATTCCCC TTTTCTGTCG CCAGTGTTGC 4050
ATGTTAAATT GGTTTTCATT ACATAATCAA CTTTGTTGCT GACATCAGTC 4100
ATTTTTATTC AGCCTTCTTG CTGCAAAATA TAGACCATAC GGTGTTTACA 4150
GAGATTCCCG CAGCACCCTT GTTATACATA ATTTAGCACA TCAGGTTTGG 4200
GTCTATCACC TTTCATTATC CGTACATGGC TTTGTAAGTC GGTTCACACG 4250
TATCGTCATA CTGTATGTTA TTTCAATGTC ATTAGGGTGT GGAGCCTGCA 4300
AGTACATATC CTGATCTGGG ATTGCCACCT GAATGGTATG GAGCTTTAGA 4350
ATGGGTATTT CCAGAATGGG CAAGGAGGCA TGCCCTTGAC AAGGGTGAGG 4400
CAGTTAACTT TTTGAAAGGA GCAGTTGTGA CAGCAGATCG AATTGTGACC 4450
GTCAGTCAGG TGAAATACTC AATACTTCTC TTTTTTCTTT GCGGGATGTT 4500
CTTCAGTTCA ATTGCCCTGT CTTTCACCCA ATTAAGAAAT GATTTAATCT 4550
TTTGTTTCTA GGGTTATTCA TGGGAGGTCA CAACTGCTGA AGGTGGACAG 4600
GGCCTCAATG AGCTCTTAAG CTCCCGAAAA AGTGTATTGA ATGGTAACTA 4650
TATTTGAATC CACTTATCTT CTTCTGAAAC ATATTTACAG AAATAGATGG 4700
ATGGGTTGCA AGAATAAATT CAGTTTGCTC TTTCGGTATG AAGGAATTGT 4750
AAATGGAATT GACATTAATG ATTGGAACCC CACCACAGAC AAGTGTCTCC 4800
CTCATCATTA TTCTGTCGAT GACCTCTCTG GAAAGGTGTG TGGATAGTAC 4850
CCTATATAAT AACATGTATA TCTGATCTAG TACTTTCTTT TTCTTTGCTA 4900
GTTTGCTTCC CATGATGTTC TCACTAACTA ATCCTATGTG GTTTGGCATA 4950
CTTGTCAGGC CAAATGTAAA GCTGAATTGC AGAAGGAGCT GGGTTTACCT 5000
GTAAGGGAGG ATGTTCCTCT GGTTAGATAC AAACCCCTAA GATATATATT 5050
TTTTAAATCC CTAAAAAAAA CTTGCCGATC ATCTCATTAG CTTGATTCAC 5100
AGATTGGCTT TATTGGAAGA CTGGATTACC AGAAAGGCAT TGATCTCATT 5150
AAAATGGCCA TTCCAGAGCT CATGAGGGAG GACGTGCAGT TTGTAAGTTC 5200
ATATTCTTTT TCTTGAGACT AGAGTATAAA TCAAACATGT AGGTGTGGGG 5250
TGGTATAATA CAGACATAAG TTCCAGCTAT TGCTTCCATG AGAAT1T1 AA 5300
TGCTATTCAG TAATATGCTA CTGCAAGTTT TGAAACAAAG TTGGAAGCAA 5350
TAAATATATG TGTAGCACTG ACCATGCAGT GCCACTATAG CTGGAATGTC 5400
CTGTAGTCTA TGTGATCTAA CACACTCAAC AACATG'i 11 1 CGCATACAAA 5450
CACATGCGTG CGCGCAACAA ACATACTCTA CAATAAAATT GGCTTGGTGA 5500
ACTGCAGACA TGCTCTTATC TCCATTCCAA CATTTCTTGT TTCAACATTG 5550
GCTGAAGACT AAGAGAAGGG GGACCCAGGG TGATGTAGCC AACTAGATCC 5600
AGTAAGGAAG CTAGCCGAGC CTAGGAGGAT TCGCTTAGGT AGCTGGAACG 5650
TAGGGTCTCT GACAGGGAAG CTTCGGGAGC TAGTCGATGC AGTGGTGAGG 5700
AGAGGTGTTG ATATCCTTTG CGTCCAAGAA ACCAAATGTA GGGGACAGAA 5750
GGCGAAGGAG GTGGAGGATA CCGGCTTCAA GCTGTGGTAC ATGGGACGGC 5800
TGCAAACAGA AATGGCGTAG GCATCTTGAT CAACAAGAGC CTTAAGTATG 5850
GAGTGGTAGA CGTCAAGAGA CGTGGGGACC GGATTATCCT CGTCAAGCTG 5900
GTAGTTGGGG ACTTAGTTCT CAATGTTATC AGCGTGTATG CCCCGCAAGT 5950
AGGCCACAAT GAGAACGCCA AGAGGGAGTT CTGGGAAGGC CTGGAAGACA 6000
TGGTTAGGAG TGTACCGATT GGCGAGAAGC TCTTCATAGG AGGAGACCTC 6050
AATGGCCACG TGGGTACATC TAACATAGGT TTTGAAGGGG CACATGGGGG 6100
CTTTGGCTAT GGCATCAAGA ATCAAGAAGA AGATGTCTTA CGCl'l 1GCTC 6150
TAGCCTACGA CATGATTGTA GCTAACACCC TCTTTAGAAA GAGAGAATCA 6200
CATCTGGTGA CTTTTAGTAG TGGCCAACAC TAGCCAGATC GATTTCATCC 6250
TCTCGAGAAG AGAAGATAGG TGTGCGCGCC TAGACTGCAA GGTGATACCT 6300
TCGGATTCGT GTCCAGCGGG ATAAGCGTGC CAAAGTCGCT AGAATGAAGT 6350
GGTGGAAGCT CAAGGGGGAG GTAGCTCAGG CGTTCAAGGA GAGGGTCATT 


6400 
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AGGGAGGGCC CTTGGGAGGA AGGAGGGGAT GCGGACAATG TGTGGATGAA 6450 

GATGGCGACT TGCATTCGTA AGGTGGCCTC GGAGGAGTGT GGAGTGTCCA 6500
GGGGATGGAG AAGCGAAGAT AAGGATACCT GGTGGTGGAA TGATGATGTC 7000
CAGAAGGCAA TTAAAGAGAA GAAAGATTGC TTTAGACGCC TATACTTGGA 7050
TAGGAGTGCA GTCAACATAG AAAAGTACAA GATGGCGAAG AAGGCCGCAA 7100
AGCGAGCTGT CAGTGAAGCA AGGGGTCGGG CATATGAGGA TCTCTACCAA 7150
CGGTTAGGCA CGAAGGAAGG CGAAAGGGAC ATCTATAAGA TGGCCAAGAT 7200
CCGAGAGAGA GGAAGACGAG GGATATTGGC CAAGTCAAAT GCATCAAGGA 7250
TGGAGCAGAC CAACTCTTGG TGAAGGACGA GGAGATTAAG CATAGATGGC 7300
GGGAGTACTT CGACAAGCTG TTCAATGGGG AGGATGAGAG TCCTACCATT 7350
GAACTTGACG ACTCCTTTGA TGAGACCATC ATGCGTTTTA TGCGGCGAAT 7400
CCAGGAGTCC GAGGTCAAGG AGGCTTTAAA AAGGAGGCAA GGCGATGGGC 7450
CCTGATTGTA TCCCCATTGA GGTGTGGAAA GGCCTCGGGG ACATAGCGAT 7500
AGTATGGCTA ACCAAGCTAT TCAACCTCAT TTTTCGGGCA AACAAGATGC 7550
CAGAAGAATG GAGACGAAGT ATATTAGTAC CAATCATCAA ACAGGGGGGA 7600
TGTTCAGAGT TGTACTAATT ACCATGGAAT TAAGCTGATG AGCCATACAA 7650
TGAAGCTATG GGAGAGAATC ATTGAGCACC GCTTAAGAAG AATGACAAGC 7700
GTGACCAAAA ATCAGTTTGG TTTCATGCCT GGGAGGTCGA CCATGGAAAC 7750
CATTTTCTTG GTACGACAAC TTATGGAGAG ATACAGGGAG CAAAAGAAGG 7800
ACTTGCATAT GGTGTTCATT GACTTGAAGA AGGCCTATAA TAAGATACCG 7850
CGGAATGTCA TGTGGTGGGC CTTGGAGAAA CACAAAGTCC CAGCAAAGTA 7900
CATTACCCTC ATCAAGGACA TGTACGATAA TGTTGTGACA AGTGTTCGAA 7950
CAAGTGATGT CGACACTAAT GACTTCCCGA TTAAGATAGG ACTGCATCAG 8000
GGGTCAGCTT TGAGCCCTTA TCTTTTTGCC TTGGTGATGG ATGAGGTCAC 8050
AAGGGATATA CAAGGAGATA TCCCATGGTG TATGCTCTTT GTGGATGATT 8100
TGGTGCTAGT TGACGATAGT CGGGCGGGGG TAAATAACAA GTTAGAGTTA 8150
TGGAGACAAA CCTTGGAATC GAAAGGGTTT AGGCTTAGTA GAACTAAAAC 8200
CGAGTACATG ATGTGCGGTT TCAGTACTAC TAGGTGTGAG GAGGAGGAGG 8250
TTAGCCTTGA TGGGCAGGTG GTACCCCAGA AGGACACCTT TCGATATTTG 8300
GGGTCAATGC TGCAGGAGGA TGGGGGTATT GATGAAGATG TGAACCATCG 8350
AATCAAAGCT GGATGGATGA AGTGGCGCCA AGCTTCTGGC ATTCTTTGTG 8400
ACAAGAGAGT GCCACAAAAG CTAAGGCAAG TTCTACAGGA CGGCGGTTCG 8450
ACCCGCAATG TTGTATGGCG CTGAGTGTTG GCCGACTAAA AGGCGACATG 8500
TTCAACAGTT AGGTGTGGCG GAGATGCGTA TGTTGAGATG GATGTGTGGC 8550
CACACGAGGA AGRATCGAGT CCGGAATGAT GATATACGAG ATAGAGTTGG 8600
GGTAGCACCA ATTGAAGAGA AGCTTGTCCA ACATCGTCTG AGATGGTTTG 8650
GGCATATTCA GCGCACGCCT CCGAAAACTC CAGTGCATAA CGGACGGCTA 8700
AAGCGTGCGG AGAATGTCAA GAGAGGGCGG GGTAGACCGA ATTTGACATG 8750
GGAGGAGTCC GTTAAGAGAG ACCTGAAGGT TTGGAGTATT ACGAAAGAAC 8800
TAGCTATGGA CARGGGTGCG TGGAAGCTTG TTATCCATGT GCCAGAGCCA 8850
TGAGTTGATC ACGAGATCTT ATGGGTTTCA CCTCTAGCCT ACCCCAACTT 8900
GTTTGGGACT AAAGGCTTTG TTGTTGTTGT TGTTGTTGTT GTTGTAGCCA 8950
ACTAAATCCA GTTGATCAGT GGTTTTTACT CTTATTTTTA CAGGTCATGC 9000
TTGGATCTGG GGATCCAATT TTTGAAGGCT GGATGAGATC TACCGAGTCG 9050
AGTTACAAGG ATAAATTCCG TGGATGGGTT GGATTTAGTG TTCCAGTTTC 9100
CCACAGAATA ACTGCAGGGT ATGCCGAGAA CTTCTTAACA AGACCTTCGT 9150
TATCAGCTTG GATATATTAT AATGTTCAAA ACATTTATGT CTCTCTTTTT 9200
GTGCAGTTGC GATATATTGT TAATGCCATC CAGGTTTGAA CCTTGTGGTC 9250
TTAATCAGCT ATATGCTATG CAATATGGTA CAGTTCCTGT AGTTCATGGA 9300
ACTGGGGGCC TCCGAGTAAG ACAACTGCCT TGAAAATTAT CGTTATCTTG 9350
GCTCCAACGC AAATGTTCTA ATTGGCTCGT GTATTCAACA GGACACAGTC 9400
GAGACCTTCA ACCCTTTTGG TGCAAAAGGA GAGGAGGGTA CAGGGTACGC 9450
ACTGCTCAAT TTTAGCTAAC TTTCAGTTTA TCTTTTTGCA ATGTCTTGGG 9500
GGTTCATTGC GCCATAAATC AACTTGTGAT AATTAACTGT TACTGTTCTG 9550
TACTTGCAGG TGGCCGTTCT CACCGCTAAC CGTGGACAAG ATGTTGTGGG 9600
TAAGTTTTTG CTGAGCTCTT GTCCGGTTAT AGGATCGACC TTGGCTGTAG 9650
CATGGTACCT TAGTGCCCCT TGTATATAGA CCTAACCTGA TGGACTCACT 9700
TTGTCTACAC TAATCATAGT AGTCGATTGC CCGGAGGCGT TTTGCTTGGA 9750
TTCTGCTAAT TTAATTTTCA TGACGATAAC TCATACCATG GTTTGGTTCT 9800
CCGATGGGGG CCAGAATGGC GTCTAGTGTC TGCGATCTGT GTAACTAGCC 9850
AATGCCGGGT TGTTCCAAGT GAAAATTTAC CTTTTGACCA TTGTGCAGGC 9900
ATTGCGAACC GCGATGTCGA CATTCAGGGA GCACAAGCCG TCCTGGGAGG 9950
GGCTCATGAA GCGAGGCATG ACGAAAGACC ATACGTGGGA CCATGCCGCC 10000
GAGCAGTACG AGCAGATCTT CGAATGGGCC TTCGTGGACC AACCCTACGT 10050
CATGTAGACG GGGACTGGGG AGGTCGAAGC GCGGGTCTCC TTGAGCTCTG 10100
AAGACATGTT CCTCATCCTT CCGCGGCCCG GAAGGATACC CCTGTACATT 10150
GCGTTGTCCT GCTACAGTAG AGTCGCAATG CGCCTGCTTG CTTGGTCCGC 10200
CGGTTCGAGA GTAGATGACG GCTGTGCTGC TGCGGCGGTG ACAGCTTCGG 10250
GTGGATGACA GTTACAGTTT TGGGGAATAA GGAAGGGATG TGCTGCAGGA 10300
TGGTTAACAG CAAAGCACCA CTCAGATGGC AGCCTCTCTG TCCGTGTTAC 10350
AGCTGAAATC AGAAACCAAC TGGTGACTCT TTAGCCTTAG CGATTGTGAA 10400
GTTTGTTGCA TTCTGTGTAT GTTGTCTTGT CCTTAGCTGA CAAATATTTG 10450
ACCTGTTGGA TAATTCTATC TTTGCTGCTG TTTTTCTTTT GGTCAAAAGA 10500
GGGGTTCCCT CCGATTTCAT TAACGAAACC ACCAAAATAA CAGCACCCAG 10550
TGCAGGTCTC AGGTTCAGAT ATACTTAAGA CTACTAAATC TAACAGCAGC 10600
TAAAAAGCTT AAAGATTCAG GCGACATAAC CGAACAAAAT CCACAACCGA 10650
AGGGACCAAA GCAGGACAAG TAAAAAGGCA GNCGACACAA AGCGCAGGTC 10700
GCTGAAAAGG CAAGCAGACA GAGGTCTGCA TTCTGTCAAC ACCACTTGTG 10750
AAAAATGAAG AGAAGATCGA GAATTCCCGG GAATCCG 10787



(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 647 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 


(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 



WO 99/14314 



PCT/AU98/00743 




(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 1..647 

(D) OTHER INFORMATION:/product= "deduced amino acid sequence for SSS I" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Ala Ala Thr Gly Val Gly Ala Gly Cys Leu Ala Pro Ser Val Arg 
1 5 10 15 

Leu Arg Ala Asp Pro Ala Thr Ala Ala Arg Ala Ser Ala Cys Val Val 
20 25 30 

Arg Ala Arg Leu Arg Arg Leu Ala Arg Gly Arg Tyr Val Ala Glu Leu 
35 40 45 

Ser Arg Glu Gly Pro Ala Ala Arg Pro Ala Gln Gln Gln Gln Leu Ala 
50 55 60 

Pro Pro Leu Val Pro Gly Phe Leu Ala Pro Pro Pro Pro Ala Pro Ala 
65 70 75 80 

Gln Ser Pro Ala Pro Thr Gln Pro Pro Leu Pro Asp Ala Gly Val Gly 
85 90 95 

Glu Leu Ala Pro Asp Leu Leu Leu Glu Gly Ile Ala Glu Asp Ser Ile 
100 105 110 

Asp Ser Ile Ile Val Ala Ala Ser Glu Gln Asp Ser Glu Ile Met Asp 
115 120 125 

Ala Asn Glu Gln Pro Gln Ala Lys Val Thr Arg Ser Ile Val Phe Val 
130 135 140 

Thr Gly Glu Ala Ala Pro Tyr Ala Lys Ser Gly Gly Leu Gly Asp Val 
145 150 155 160 

Cys Gly Ser Leu Pro Ile Ala Leu Ala Ala Arg Gly His Arg Val Met 
165 170 175 

Val Val Met Pro Arg Tyr Leu Asn Gly Ser Ser Asp Lys Asn Tyr Ala 
180 185 190 

Lys Ala Leu Tyr Thr Gly Lys His Ile Lys Ile Pro Cys Phe Gly Gly 
195 200 205 

Ser His Glu Val Thr Phe Phe His Glu Tyr Arg Asp Asn Val Asp Trp 
210 215 220 

Val Phe Val Asp His Pro Ser Tyr His Arg Pro Gly Ser Leu Tyr Gly 
225 230 235 240 

Asp Asn Phe Gly Ala Phe Gly Asp Asn Gln Phe Arg Tyr Thr Leu Leu 
245 250 255 

Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile Leu Glu Leu Gly Gly Tyr 
260 265 270 

Ile Tyr Gly Gln Asn Cys Met Phe Val Val Asn Asp Trp His Ala Ser 
275 280 285 

Leu Val Pro Val Leu Leu Ala Ala Lys Tyr Arg Pro Tyr Gly Val Tyr 
290 295 300 

Arg Asp Ser Arg Ser Thr Leu Val Ile His Asn Leu Ala His Gln Gly 
305 310 315 320 

Leu Glu Pro Ala Ser Thr Tyr Pro Asp Leu Gly Leu Pro Pro Glu Trp 
325 330 335 

Tyr Gly Ala Leu Glu Trp Val Phe Pro Glu Trp Ala Arg Arg His Ala 
340 345 350 

Leu Asp Lys Gly Glu Ala Val Asn Phe Leu Lys Gly Ala Val Val Thr 
355 360 365 

Ala Asp Arg Ile Val Thr Val Ser Gln Gly Tyr Ser Trp Glu Val Thr 
370 375 380 

Thr Ala Glu Gly Gly Gln Gly Leu Asn Glu Leu Leu Ser Ser Arg Lys 
385 390 395 400 

Ser Val Leu Asn Gly Ile Val Asn Gly Ile Asp Ile Asn Asp Trp Asn 
405 410 415 

Pro Thr Thr Asp Lys Cys Leu Pro His His Tyr Ser Val Asp Asp Leu 
420 425 430 

Ser Gly Lys Ala Lys Cys Lys Ala Glu Leu Gln Lys Glu Leu Gly Leu 
435 440 445 

Pro Val Arg Glu Asp Val Pro Leu Ile Gly Phe Ile Gly Arg Leu Asp 
450 455 460 

Tyr Gln Lys Gly Ile Asp Leu Ile Lys Met Ala Ile Pro Glu Leu Met 
465 470 475 480 

Arg Glu Asp Val Gln Phe Val Met Leu Gly Ser Gly Asp Pro Ile Phe 
485 490 495 

Glu Gly Trp Met Arg Ser Thr Glu Ser Ser Tyr Lys Asp Lys Phe Arg 
500 505 510 

Gly Trp Val Gly Phe Ser Val Pro Val Ser His Arg Ile Thr Ala Gly 
515 520 525 

Cys Asp Ile Leu Leu Met Pro Ser Arg Phe Glu Pro Cys Gly Leu Asn 
530 535 540 

Gln Leu Tyr Ala Met Gln Tyr Gly Thr Val Pro Val Val His Gly Thr 
545 550 555 560 

Gly Gly Leu Arg Asp Thr Val Glu Thr Phe Asn Pro Phe Gly Ala Lys 
565 570 575 

Gly Glu Glu Gly Thr Gly Trp Ala Phe Ser Pro Leu Thr Val Asp Lys 
580 585 590 

Met Leu Trp Ala Leu Arg Thr Ala Met Ser Thr Phe Arg Glu His Lys 
595 600 605 

Pro Ser Trp Glu Gly Leu Met Lys Arg Gly Met Thr Lys Asp His Thr 
610 615 620 

Trp Asp His Ala Ala Glu Gln Tyr Glu Gln Ile Phe Glu Trp Ala Phe 
625 630 635 640 

Val Asp Gln Pro Tyr Val Met 
645 



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5072 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(ix) FEATURE: 

(A) NAME/KEY: promoter 

(B) LOCATION: 1..4993 

(D) OTHER INFORMATION:/function= "region containing promoter of SSS I" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 








TCTAGATGCA TGCTGGATAG CGGTCGATGT GTGGAGTAAT AGTAGTAGAT GCAGAATCGT 60
TTCGGTCTAC TTGTCGCGGA CGTGATGCCT ATATACATGA TCATACCTAG ATATTCTCAT 120
AACTATGCTC AATTCTATCA ATTGCTCGAC AGTAATTCGT TTACCCACCG TAATACTTAT 180
GATCTTGAGA GAAGTCACTA GTGAAACCTA TGCCCCCCAG GTCTATTTTG CATCATATTA 240
ATCTTCCAAT ACTTAGTTAT TTCCATTGCC GTTTATTTTA CTTTGTATCT TTATTTCTTT 300
TTATTATAAA AAATACCAAA AATATTATCT TATCATATCT ATCAGATCTC ATTCTCGTAA 360
GTGACCGTGA AGGGATTGAC AACCCCTTTA TCGTGTTGGT TGCGAGGTTC TTGTTTGTTT 420
GTGTAGGTGC GTGTGACTCG CACGTCTCCT ACTGGATTGA TACCTTGGGT TTTCAAAAAC 480
TGAGAAAAAT ACTTACGCTA CTTTACTGCA TAACCCTTTC CTCTTTAAAA AAAAAAACCA 540
ACGTAGTATT CAAGAGGTAG CACGCTACCA TCCTCTCCAA CAGGAGCGCG GAGATCTTTG 600
TCCGGCAGGT TGATGCGGGC CGGGGAAGAA CTCCAGCTGC CTTGGCCAGC TTGGTCGTGA 660
GCCGCCCCAG CGGCGTCTTG AACCTGTCCA CGTAGCGCTC CCTGACACGC GGCGTGAACT 720
GAGAAGGCTT GTCGATGAAC TCCAGCTGTT GTGCCAGCCT AGCTTGCGCC TTCTTCTGCT 780
GGGTCATGCC CTTCGAGAAA CCCACCTTGG CCACCCTTGT GCTTGAGCGG CGCGCCACCT 840
CAGCAGGCGG CGGCGTGGGG ATGAAGAGGG TGTCTGCTTC CGGAGCAGGC GGGTCGGCGT 900




TGAACTTGAA AGGCGGTGGC CCCATGATGG ATGGGGGGAG CATGCCAAAG ACTTGGTTGA 960 
GGAAAGTGGT GTTGGCGTCC ACCTCCAGTG CCTGCAGTTT GGAAGCCAGA CGATTGGCGT 102 0 
5 CGATCTCTGG CTCCGGCTGG AAGGAGGCTC GACGCTCCGG TGTGCCAGAA CGCAAAGGGA 1080 
GGAGCGGCAG CTCTGGCTGA GCAGACCCCG CGCCCATGTA CTCTGCATTG GGCCAAGGCT 114 0 
GCAGGGGCAA GCCACCGGGA TGGGGGCGCG AGGTGGACTG CGCACCGGAG GAAGGCCAAG 1200 

10 

CTCAACCTCG GTGAGGTTCG CCCCAGACCA GGGCGGCAGG CTCGGGTCCA CAAAGGGCCA 12 60 
AACCGCCTCG TCCGCCCCGA AACTGTCCAG GACAGACGGC GGACGACGGA AGGCCGTGTC 13 2 0 
15 GTCGAGCTCG AGCAGCAGAG GGTCCGTGCG GGTGATGTCT TGCCAAATGG ACTCCACCTC 13 80 
CAGCAGGAAG GGGGACTGGT CCATCGCCCC TGGCCAAGCC ACTGGTACGC CAAAGATGGC 144 0 
ATCAGCAGCG TTTGCACCAG GGGGAGCAGC CACACCTTGG AGGACAGGGA GGGTGCGGAC 1500 

20 

GTCGACGGCA GCAAAACGTG GCTGGAGCAA GTTGCCGTCG CGTGCCGGCC TCGGCGAGCG 15 60 
CGAGCGGCTG TAGGAGCGCT CGGTGCCCTC AGACTCGGAC AGTGCGCCAG TGGGAGAGCC 1620 

2 5 ATGGCGACGC CGGCCACCAC TGGACGTGCC ATGGCGCTGG TCCTGACGGC GCCTGGATGG 1680 

CCCGTCCTCG CGGGCAGCTC CACCTGAGCG GCACCCGAGG AGCACACCCC GCCAAGCTGG 17 40 
GCCAGGGCGG CTGCGGCGAC GGCGACGGCC GCGGTCGCGG TCTGCACCAT CATCTTCATC 1800 

30 

TTCGTCATCG TGGCGCCTCG GACAAGGATG CTCGCTGTCA CCGACGCGAG GGACGTGAGC 1860 
CGGCTCAGCC CGCCCTTCCT CGACGTGGCG AGCCCTGCGG ATATGCTCCT CGAGCGGCCA 1920 

3 5 TTGGGGGTCG TTGGCGCGCG GCATCTCGGG GTCGCGGTCA GCTATCGGGG TGTAGTCCTT 19 80 

TGTGGTGTCC AGGTGGATGA GCAGAGAGAA ATCCGGCCCC TCTAGCCCCT CGTCCCGGGG 2040 
GCAGCCCTCC GGCAGCGTCT GGCGGCCCCT GGGGTCCAGG GGTCGATCGA TGATGGAGAA 2100 

40 

CCCCCTTTTG GTGGGGATGT CGTCCGGACT CCATGCCCAC ACCCAGGCAA AGAGGCAGGC 2160 
CGTGTTGGAG AGGGAGGTCG TCTGCCGCTC CAACCAGTCG ACGTGGCATG TCTTCCCGAG 2220 
45 CGCATCCTGC CCCGCCTCCT TGTTCCAGGA CTGCACCGGC ATGTTCTCGA CGGCGATGCG 22 80 
GCAGTAGTAC CGCCAGACAC GGCGGTGGCC GTGTGCCGAT GGTGACCAGG CCGACAGGGA 23 40 
GAGCGCGACG CCCCAGCAGG AGACGACCCC AGCGTCGAAA GCGATGTCCC GGTGCCTGAA 2 400 

50 

GTGGACGAGC CCAGAGATGG CCAGGCGCAT TGACGCGGGG AAGGGGAAGG AGTTAGGATG 2460 
GGCGACGCGG CCGGAGTGAA CCGCGGCGTG GTGGCCGACG GGGCTGGAGA GGCAGAGGCG 2 520 
55 GAGTCATCCG AGAGAGGTGT ATCAGTGGCT CTGCACAATA CCCAGTGTCG CCACATCATA 2580 
TCCTGCTGAA TAACCACACA TGTGTACTGT CGTTAAATAA ATCATTGGTC ACGCGAACCC 2640 
GGAAAAAGAC GGCGAAAAAT TCACGGACAC ACGACTAGTA GTACCCAATA TACTCGGCAA 2700 

60 

AAACAGTGAC ACGTCGTTTT GCGTTGTCGG CCGGTGTTGT CGAGTCATTG TACTATGTTT 27 60 
TGTCGTTTCT TTCTTTTCTC CAAATCGACA AACCGTTTGT CTTTGGTTAA AAAACAGAAA 2820 
65 CATACAAAAT CAAATGAATG CATTCAAGGG CCGGTAATCC AATTCTGAGC CCAGGCTCAG 2 880 
CTACACCCGC CCTTACAAAA AAATCAAAAT AAATACTAGA AAAATTCAAA AAATTCCAAT 2940 
TTGTTTGTGC GTGGTAGATA ATTTGATGCG TGAGGTACGC TTCAATTTTC AAATTATTTG 3000 
GACATCTGAG CAGCTCTCAG CAAAAAAGAC AAATTCGGGG TCTGTAAAAA TGTTTACTGT 3 060 

5 

TCATGCACTG TTCTGACCCG ATTTGTC TTT TTTGCTGAGA GCTTCTCAGA AGTCCAAATG 3120 
AGCTAAAATT TTGAGCGGAG CTTACGTGAT AAAATGTCTA TCATGCAAAA AAGGATTGGA 3180 
10 ATTTTTTGAA TTTT'TTTTAT TTTTTGTGAT TTGTTTCCTG GACGGGTGCA GATAAGCCTG 3 240 
GGCACCGAAA CGCCGCACTC AGGCTCATCC TTTTCTATAA AAGAAAAGAA ATACATACAA 3 3 00 
TTTCCCTCTG TTTTTTGAGC AAGGGGC ACC ACCCACCAAA GAGTTTTCAA CTCACATGGT 3 3 60 

15 

ATTAGAGCAT CTACAGCCGG GCGTCTCAAA CCAGCCTCAT ACGCTTGAGC GGGTCGCCTT 3 42 0 
GGTCACGATT TTTTGACCCA GACGGGCCCC TCAAACGGTC CTTAAACGCC CAGGCTGACC 3 480 
20 GACAACCCAC ATATCCAGCC CAAATATGGG GTGGATATGG GGGCGCCCGG GCACGCCAGC 3 54 0 
CCGCGGACAC CACACATCTT CAGTTTCTAA TTTGAGATAT CCGGATGTGG AATGCGTTTT 3600 
TGAGGGGTGA CCGGTCCCTG TCCGTGGATG CGCCCGGACG TTTG AG GGGT TGGATTTGCC 3 660 

25 

AAGTCTGATT AG AG ATGC TC TTAGGTGTTC CACCCCCATC CCTTGATGGC TAGGGCAAAC 372 0 
TCTCCCCTCC AAACTTTGTC GGCGAGCCTG TGGATTCTTC TCTCCTCTGC CCGCTGCTCC 37 8 0 

3 0 GGCGGCTGAT GGCGGGGAGG AGAATCCCGG TGTCTTCGCT TGGTTAGTTG TTTAAGTTAC 3 84 0 

GTACTTTTTT AGTCCTCGCA GGTGCGGCGT TCGGACGTAT GGTCGTGCTT CTTTTTTGAG 3 90 0 
TTTGTCTTCC GGGCTCTGAT CCTCCTCGAG TTCGTCCATC TGGACGTACT CGACGGAGCT 3 96 0 

35 

CCGGCATAGA TTCCTATCAT CGTCTTGGTG AGGTGAGGTT ATGGTTTCTT GTCATGTGGG 402 0 
CAGATTTGGT GCCAGATGCT TCATATCTAT TCAAGGGTTC AGCGGCAACA ACTGCGGCTC 408 0 

4 0 CAGAGCGATG GTCCTTAAGG GCACGTGCAC GAAGACTTCA CGGCTGTTAT CGACAAGGTC 414 0 

AAGCCGGCTC CGATAGGGGA GCAGCGACAG CGGCGCGTCA ACCGCTCGTT CTGGCGGCAG 4200 
TAGTGGTCGT TCGGTGCTCT CGGAACCTCG ATGTAATTTT TATGATTTTA GAG ATGC TTT 4260 

45 

GTACTTCCGA TCGATGAACT CTGATAATAG ATATCTCTTC TCTCGCAAAA AAAGAGAGTT 43 2 0 
TTCAACTGAA AACAAAAGAG TTTCACTAGT TCTTCTTTTA GAAACAGAGT TTCACTAGCA 43 80 
50 CTTTTTTTTG CGAGAAGTCG AG TTTC ACTA AGTACTAAAC CCACGCAATT ATTCTCAAAA 444 0 
AAAAAACCCA CGCAACTGTC TGGATCCATC TTCGTTTTTT CCCCGAGAAT CGTCTGGATC 4500 
CATTTTCGTG TGCGAGGCAT CCTCTCATTT TGCACGGCCC AGCTCTCTTC TCGCCGGCGT 4560 

55 

ACGCTGCTAC ATGTCGG C AC TCCACGCAAA CAAAAAGAAG CCCAACCGAA AACGCACGCG 462 0 
CCTTTCCAGG CTCACCACGG AAAAAAATAC CACGCGCCGC TCACGAGCAA ACCGTGACAA 4680 
60 CAGCCAGCCA GATATGGCAA CGGAGGCACG GGCCGCACAC AGCCACTGAA AACCGCAGCT 474 0 
GCTCTTCCGT CCGTCCGTCC CTCCGCCCGT CCGCGCCACT CCACTCGCCT TGCCCCACTC 4800 
CCACTCTTCT CTCCCCGCGC ACACCGAGTC GGCACCGGCT CATCACCCAT CACCTCGGCC 4860 
TCGGCCACCG GCAAACCCCC CGATCCGCTT TTGCAGGCAG CGCACTAAAA CCCCGGGGAG 4920 



CGCGCCCCGC GGCAGCAGCA GCACCGCAGT GGGAGAGAGA GGCTTCGCCC CGGCCCGCAC 4980 
CGAGCGGGGC GATCCACCGT CCGTGCGTCC GCACCTCCTC CGCCTCCTCC CCTGTCCCGC 5040 
GCGCCCACAC CCATGGCGGC GACGGGCGTC GG 5072 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1706 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1706 

(D) OTHER INFORMATION:/product= "partial cDNA for hexaploid wheat DBE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

GCT GTG TCG AAG CTT GAC TAT TTG AAG GAG CTT GGA GTT AAT TGT ATT 48 
Ala Val Ser Lys Leu Asp Tyr Leu Lys Glu Leu Gly Val Asn Cys Ile 
1 5 10 15 

GAA TTA ATG CCC TGC CAT GAG TTC AAC GAG CTG GAG TAC TCA ACC TCT 96 
Glu Leu Met Pro Cys His Glu Phe Asn Glu Leu Glu Tyr Ser Thr Ser 
20 25 30 

TCT TCC AAG ATG AAC TTT TGG GGA TAT TCT ACC ATA AAC TTC TTT TCA 144 
Ser Ser Lys Met Asn Phe Trp Gly Tyr Ser Thr Ile Asn Phe Phe Ser 
35 40 45 

CCA ATG ACG AGA TAC ACA TCA GGC GGG ATA AAA AAC TGT GGG CGT GAT 192 
Pro Met Thr Arg Tyr Thr Ser Gly Gly Ile Lys Asn Cys Gly Arg Asp 
50 55 60 

GCC ATA AAT GAG TTC AAA ACT TTT GTA AGA GAG GCT CAC AAA CGG GGA 240 
Ala Ile Asn Glu Phe Lys Thr Phe Val Arg Glu Ala His Lys Arg Gly 
65 70 75 80 

ATT GAG GTG ATC CTG GAT GTT GTC TTC AAC CAT ACA GCT GAG GGT AAT 288 
Ile Glu Val Ile Leu Asp Val Val Phe Asn His Thr Ala Glu Gly Asn 
85 90 95 

GAG AAT GGT CCA ATA TTA TCA TTT AGG GGG GTC GAT AAT ACT ACA TAC 336 
Glu Asn Gly Pro Ile Leu Ser Phe Arg Gly Val Asp Asn Thr Thr Tyr 
100 105 110 

TAT ATG CTT GCA CCC AAG GGA GAG TTT TAT AAC TAT TCT GGC TGT GGG 384 
Tyr Met Leu Ala Pro Lys Gly Glu Phe Tyr Asn Tyr Ser Gly Cys Gly 
115 120 125 

AAT ACC TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC ATT GTA GAT 432 
Asn Thr Phe Asn Cys Asn His Pro Val Val Arg Gln Phe Ile Val Asp 
130 135 140 

TGT TTA AGA TAC TGG GTG ATG GAA ATG CAT GTT GAT GGT TTT CGT TTT 480 
Cys Leu Arg Tyr Trp Val Met Glu Met His Val Asp Gly Phe Arg Phe 
145 150 155 160 

GAT CTT GCA TCC ATA ATG ACC AGA GGT TCC AGT CTG TGG GAT CCA GTT 528 
Asp Leu Ala Ser Ile Met Thr Arg Gly Ser Ser Leu Trp Asp Pro Val 
165 170 175 

AAC GTG TAT GGA GCT CCA ATA GAA GGT GAC ATG ATC ACA ACA GGG ACA 576 
Asn Val Tyr Gly Ala Pro Ile Glu Gly Asp Met Ile Thr Thr Gly Thr 
180 185 190 

CCT CTT GTT ACT CCA CCA CTT ATT GAC ATG ATC AGC AAT GAC CCA ATT 624 
Pro Leu Val Thr Pro Pro Leu Ile Asp Met Ile Ser Asn Asp Pro Ile 
195 200 205 

CTT GGA GGC GTC AAG CTC ATT GCT GAA GCA TGG GAT GCA GGA GGC CTC 672 
Leu Gly Gly Val Lys Leu Ile Ala Glu Ala Trp Asp Ala Gly Gly Leu 
210 215 220 

TAT CAA GTA GGT CAA TTC CCT CAC TGG AAT GTT TGG TCT GAG TGG AAT 720 
Tyr Gln Val Gly Gln Phe Pro His Trp Asn Val Trp Ser Glu Trp Asn 
225 230 235 240 

GGG AAG TAC CGG GAC ATT GTG CGC CAA TTC ATT AAA GGC ACT GAT GGA 768 
Gly Lys Tyr Arg Asp Ile Val Arg Gln Phe Ile Lys Gly Thr Asp Gly 
245 250 255 

TTT GCT GGT GGT TTT GCC GAA TGT CTT TGT GGA AGT CCA CAC CTA TAC 816 
Phe Ala Gly Gly Phe Ala Glu Cys Leu Cys Gly Ser Pro His Leu Tyr 
260 265 270 

CAG GCA GGA GGA AGG AAA CCT TGG CAC AGT ATC AAC TTT GTA TGT GCA 864 
Gln Ala Gly Gly Arg Lys Pro Trp His Ser Ile Asn Phe Val Cys Ala 
275 280 285 

CAT GAT GGA TTT ACA CTG GGT GAT TTG GTA ACA TAT AAT AAC AAG TAC 912 
His Asp Gly Phe Thr Leu Gly Asp Leu Val Thr Tyr Asn Asn Lys Tyr 
290 295 300 

AAT TTA CCA AAT GGG GAG AAC AAT AGA GAT GGA GAA AAT CAC AAT CTT 960 
Asn Leu Pro Asn Gly Glu Asn Asn Arg Asp Gly Glu Asn His Asn Leu 
305 310 315 320 

AGC TGG AAT TGT GGG GAG GAA GGA GAA TTC GCA AGA TTG TCT GTC AAA 1008 
Ser Trp Asn Cys Gly Glu Glu Gly Glu Phe Ala Arg Leu Ser Val Lys 
325 330 335 

AGA TTG AGG AAG AGG CAG ATG CGC AAT TTC TTT GTT TGT CTC ATG GTT 1056 
Arg Leu Arg Lys Arg Gln Met Arg Asn Phe Phe Val Cys Leu Met Val 
340 345 350 

TCT CAA GGA GTT CCA ATG TTT TAC ATG GGC GAT GAA TAT GGC CAC ACA 1104 
Ser Gln Gly Val Pro Met Phe Tyr Met Gly Asp Glu Tyr Gly His Thr 
355 360 365 

AAA GGG GGC AAC AAC AAT ACA TAC TGC CAT GAT TCT TAT GTC AAT TAT 1152 
Lys Gly Gly Asn Asn Asn Thr Tyr Cys His Asp Ser Tyr Val Asn Tyr 
370 375 380 

TTT CGC TGG GAT AAA AAA GAA CAA TAC TCT GAC TTG CAC AGA TTC TGC 1200 
Phe Arg Trp Asp Lys Lys Glu Gln Tyr Ser Asp Leu His Arg Phe Cys 
385 390 395 400 

TGC CTC ATG ACC AAA TTC CGC AAG GAG TGC GAG GGT CTT GGC CTT GAG 1248 
Cys Leu Met Thr Lys Phe Arg Lys Glu Cys Glu Gly Leu Gly Leu Glu 
405 410 415 

GAC TTT CCA ACG GCC GAA CGG CTG CAG TGG CAT GGT CAT CAG CCT GGG 1296 
Asp Phe Pro Thr Ala Glu Arg Leu Gln Trp His Gly His Gln Pro Gly 
420 425 430 

AAG CCT GAT TGG TCT GAG AAT AGC CGA TTC GTT GCC TTT TCC ATG AAA 1344 
Lys Pro Asp Trp Ser Glu Asn Ser Arg Phe Val Ala Phe Ser Met Lys 
435 440 445 

GAT GAA AGA CAG GGC GAG ATC TAT GTG GCC TTC AAC ACC AGC CAC TTA 1392 
Asp Glu Arg Gln Gly Glu Ile Tyr Val Ala Phe Asn Thr Ser His Leu 
450 455 460 

CCG GCC GTT GTT GAG CTC CCA GAG CGC GCA GGG CGC CGG TGG GAA CCG 1440 
Pro Ala Val Val Glu Leu Pro Glu Arg Ala Gly Arg Arg Trp Glu Pro 
465 470 475 480 

GTG GTG GAC ACA GGC AAG CCA GCA CCA TAT GAC TTC CTC ACC GAC GAC 1488 
Val Val Asp Thr Gly Lys Pro Ala Pro Tyr Asp Phe Leu Thr Asp Asp 
485 490 495 

TTA CCT GAT CGC GCT CTC ACC ATA CAC CAG TTC TCT CAT TTC CTC AAC 1536 
Leu Pro Asp Arg Ala Leu Thr Ile His Gln Phe Ser His Phe Leu Asn 
500 505 510 

TCC AAC CTC TAC CCC ATG CTC AGC TAC TCA TCG GTC ATC CTA GTA TTG 1584 
Ser Asn Leu Tyr Pro Met Leu Ser Tyr Ser Ser Val Ile Leu Val Leu 
515 520 525 

CGC CCT GAT GTT TGA GAG ACA AAT ATA TAC AGT AAA TAA TAT GTC TAT 1632 
Arg Pro Asp Val * Glu Thr Asn Ile Tyr Ser Lys * Tyr Val Tyr 
530 535 540 

ATG TAG TCC TTT GGC GTA TTA TCA GTG TGC ACA ATT GCT CTA TTG CCA 1680 
Met * Ser Phe Gly Val Leu Ser Val Cys Thr Ile Ala Leu Leu Pro 
545 550 555 560 

GTG ATC TAT TCG ATA GCG GCC GCG AA 1706 
Val Ile Tyr Ser Ile Ala Ala Ala 
565 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9289 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(vi) ORIGINAL SOURCE: 
(A) ORGANISM: Triticum tauschii 
(F) TISSUE TYPE: Endosperm 

(ix) FEATURE: 
(A) NAME/KEY: CDS 

(B) LOCATION: 1..9289 

(D) OTHER INFORMATION:/product= "genomic sequence of DBE" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CGG GAC CGT CCC TTG GCA ACT TGG GTT ACG TTG GGA CCT GAC GCT TCG 48 
Arg Asp Arg Pro Leu Ala Thr Trp Val Thr Leu Gly Pro Asp Ala Ser 
570 575 580 

CTT ATC CGG TGT GCC CTG AGA CGA GAT ATG TGC AGC TCC TAT CGG ATT 96 
Leu Ile Arg Cys Ala Leu Arg Arg Asp Met Cys Ser Ser Tyr Arg Ile 
585 590 595 600 

TGT CGG CAC ATT CGG CGG CTT TGC TGG TCT TGT TTT ACC ATT GTC GAA 144 
Cys Arg His Ile Arg Arg Leu Cys Trp Ser Cys Phe Thr Ile Val Glu 
605 610 615 

ATG TCT TAT AAA CCG GGA TTC CGA GAC TGA TCG GGT CTT CCC GGG AGA 192 
Met Ser Tyr Lys Pro Gly Phe Arg Asp * Ser Gly Leu Pro Gly Arg 
620 625 630 

AGG TTT ATC CTT CGT TGA CCG TGA GAG CTT ATA ATG GGC TAA GTT GGG 240 
Arg Phe Ile Leu Arg * Pro * Glu Leu Ile Met Gly * Val Gly 
635 640 645 

ACA CCC CTG CAG GGT ATT ATC TTT CGA AAG CCG TGC CCG CGG TTA TGA 288 
Thr Pro Leu Gln Gly Ile Ile Phe Arg Lys Pro Cys Pro Arg Leu * 
650 655 660 

GGC AGA TGG GAA TTT GTT AAT GTC CGA TTG TAG AGA ACC TGT CAC TTG 336 
Gly Arg Trp Glu Phe Val Asn Val Arg Leu * Arg Thr Cys His Leu 
665 670 675 680 

ACT TAA TTT AAA ATT CAT CAA CCG TGT GTG TAG CCG TGA TGG TCT CTT 384 
Thr * Phe Lys Ile His Gln Pro Cys Val * Pro * Trp Ser Leu 
685 690 695 

TTC GGC GGA GTC CGG GAA GTG AAC ACG GTT TGA GTT ATG CAT GAA CGT 432 
Phe Gly Gly Val Arg Glu Val Asn Thr Val * Val Met His Glu Arg 
700 705 710 

AAG TAG TTT CAG GAT CAC TCC TTG ATC ACT TCT AGC TCC GCG ACC GTT 480 
Lys * Phe Gln Asp His Ser Leu Ile Thr Ser Ser Ser Ala Thr Val 
715 720 725 

GCG TTG TTT CTC TTC TCG CTC TCA TTT GCG TAT GTT AGC CAC CAT ATA 528 
Ala Leu Phe Leu Phe Ser Leu Ser Phe Ala Tyr Val Ser His His Ile 
730 735 740 

TGC TTA GTG TCT GCT GCA GCT CCA CCT CAT TAC CCC TTC CTT TCC TAT 576 
Cys Leu Val Ser Ala Ala Ala Pro Pro His Tyr Pro Phe Leu Ser Tyr 
745 750 755 760 

AAG CTT AAA TAG TCT TGA TCT CGC GGG TGT GAG ATT GCT GAG TCC TCG 624 
Lys Leu Lys * Ser * Ser Arg Gly Cys Glu Ile Ala Glu Ser Ser 
765 770 775 

TGA CTT ACA GAT TCT ACC AAA ACA GTT GCA GGT GTC GAC GAT GCC AGT 672 
* Leu Thr Asp Ser Thr Lys Thr Val Ala Gly Val Asp Asp Ala Ser 
780 785 790 

GCA GGT GAC GCA ACC GAG CTC AAG TGG GAG TTC GAC GAG GAA CGT GGT 720 
Ala Gly Asp Ala Thr Glu Leu Lys Trp Glu Phe Asp Glu Glu Arg Gly 
795 800 805 

CGT TAC TAT GTT TCT TTT CCT GAT GAT CAG TAG TGG AGC CCA GTT GGG 768 
Arg Tyr Tyr Val Ser Phe Pro Asp Asp Gln * Trp Ser Pro Val Gly 
810 815 820 

ACG ATC GGG GAT CTA GCA TTT GGG GTT ATC TTA ATT TCT TTT AGA TTT 816 
Thr Ile Gly Asp Leu Ala Phe Gly Val Ile Leu Ile Ser Phe Arg Phe 
825 830 835 840 

GAC CGT AAT CGG TCT ATG TGT GGA TTT TGG ATG ATG TAT GAA TTA TTT 864 
Asp Arg Asn Arg Ser Met Cys Gly Phe Trp Met Met Tyr Glu Leu Phe 
845 850 855 

ATG TAT TGT GTG AAG TGG CGA TTG TAA GCC AAC TCT CGT TAT CCC ATT 912 
Met Tyr Cys Val Lys Trp Arg Leu * Ala Asn Ser Arg Tyr Pro Ile 
860 865 870 

CTT GTT CAT TAC ATG GGA TTG TGT GAA GAT GAC CCT TCT TGC GAC AAA 960 
Leu Val His Tyr Met Gly Leu Cys Glu Asp Asp Pro Ser Cys Asp Lys 
875 880 885 

ACC ACA ATG CGG TTA TGC CTC TAA GTC GTG CCT CGA CAC GTG GGA GAT 1008 
Thr Thr Met Arg Leu Cys Leu * Val Val Pro Arg His Val Gly Asp 
890 895 900 

ATA GCC GCA TCG TGG GCG TTA CAC GCA AGT CTT CAT AGC AAC CAA AAC 1056 
Ile Ala Ala Ser Trp Ala Leu His Ala Ser Leu His Ser Asn Gln Asn 
905 910 915 920 

TCC TCT CCG CAT TAC AAG CCA CCA ATC GCA GCC ACC ATG ACT TTC TTC 1104 
Ser Ser Pro His Tyr Lys Pro Pro Ile Ala Ala Thr Met Thr Phe Phe 
925 930 935 

ACC ACT GTC AAT GCC ATG AAA ATC TAT ATG TAG ACA TGT CCC ATT GCA 1152 
Thr Thr Val Asn Ala Met Lys Ile Tyr Met * Thr Cys Pro Ile Ala 
940 945 950 

TCG GCA AGA AAG CGA AGC TTC ACG GCA CAC CTT CAT GAA GCC TCT CTG 1200 
Ser Ala Arg Lys Arg Ser Phe Thr Ala His Leu His Glu Ala Ser Leu 
955 960 965 

GCC GAA GAC AAG GAT GCG CCC GAC CGG ATC AAT TCC TAT CTA GAT ACC 1248 
Ala Glu Asp Lys Asp Ala Pro Asp Arg Ile Asn Ser Tyr Leu Asp Thr 
970 975 980 

TAG TGG AGC CAT GCG CCA ATA GCG GAG ATC TCC GAG AGG AAG ACC GGA 1296 
* Trp Ser His Ala Pro Ile Ala Glu Ile Ser Glu Arg Lys Thr Gly 
985 990 995 1000 

ACT CGT CGG ACG TCG GCG TCC AAA TCG AGG AGG CCG GCA TGA AGC ACA 1344 
Thr Arg Arg Thr Ser Ala Ser Lys Ser Arg Arg Pro Ala * Ser Thr 
1005 1010 1015 

TCG AGG ATG GTG ATC CCC ATA CGG GTA GAT CGG GTC GGC CGC CAT CTC 1392 
Ser Arg Met Val Ile Pro Ile Arg Val Asp Arg Val Gly Arg His Leu 
1020 1025 1030 

ACA CCG AGA TTA GGA TGC TTA AAA CGG TTT TTT TGG CAC TAG CAT TAT 1440 
Thr Pro Arg Leu Gly Cys Leu Lys Arg Phe Phe Trp His * His Tyr 
1035 1040 1045 

TTT GCA TCA I <— GTT GGA GAG AAC ATG AGA GAG CCC CAT TTC TTC CAC 1488 
Phe Ala Ser OCX. Val Gly Glu Asn Met Arg Glu Pro His Phe Phe His 
1050 1055 1060 

GGT TCT ACC TAT GGG ATC TTG TTC TGC TTG CAA CCG GGC CTC ACG GAA 1536 
Gly Ser Thr Tyr Gly Ile Leu Phe Cys Leu Gln Pro Gly Leu Thr Glu 
1065 1070 1075 1080 

AAC CCG CGC CAG CGG ACC CAC CCC ATG CTA GCA GGG CAC GGC ACC CGC 1584 
Asn Pro Arg Gln Arg Thr His Pro Met Leu Ala Gly His Gly Thr Arg 
1085 1090 1095 

AGC GGC CGG TCC AAA TGG ACG GTG AGA ACC GCA ACG CGA CAC GCC CGG 1632 
Ser Gly Arg Ser Lys Trp Thr Val Arg Thr Ala Thr Arg His Ala Arg 
1100 1105 1110 

CAC TGT CAG CAA AGC GAG AGC GCG CGC ACG GCA CAC GCA CGC TCG GAC 1680 
His Cys Gln Gln Ser Glu Ser Ala Arg Thr Ala His Ala Arg Ser Asp 
1115 1120 1125 

GAA CGG ACG GTG CGA TCG ATC CCT CCC CCC TCG CTC AAC CAC AGT AGT 1728
Glu Arg Thr Val Arg Ser Ile Pro Pro Pro Ser Leu Asn His Ser Ser
1130 1135 1140

ACC CTG CCA CAC TAT CAC GCA CGC ACT CGA GTC ACA CCT CCC ACG AAG 1776
Thr Leu Pro His Tyr His Ala Arg Thr Arg Val Thr Pro Pro Thr Lys
1145 1150 1155 1160

AAC CAA CAG GAG GCG CGG ATC CCA CCG ATA AAT AAC CCC GCC TCG CCG 1824
Asn Gln Gln Glu Ala Arg Ile Pro Pro Ile Asn Asn Pro Ala Ser Pro
1165 1170 1175

CTC CTC CCC AAA ATC AAT CAC CGA TCG CTC GGG GTT CCC GGC ATG ACG 1872
Leu Leu Pro Lys Ile Asn His Arg Ser Leu Gly Val Pro Gly Met Thr
1180 1185 1190

ATG ATG GCC ATG GCC AAG GCG CCC TGC CTC TGC GCG CGC CCG TCC CTC 1920
Met Met Ala Met Ala Lys Ala Pro Cys Leu Cys Ala Arg Pro Ser Leu
1195 1200 1205

GCC GCG CGC GCG AGG CGG CCG GGG CCG GGG CCG GCG CCG CGC CTG CGA 1968
Ala Ala Arg Ala Arg Arg Pro Gly Pro Gly Pro Ala Pro Arg Leu Arg
1210 1215 1220

CGG TGG CGA CCC AAT GCG ACG GCG GGG AAG GGG GTC GGC GAG GTG TGC 2016
Arg Trp Arg Pro Asn Ala Thr Ala Gly Lys Gly Val Gly Glu Val Cys
1225 1230 1235 1240

GCC GCG GTT GTC GAG GCG GCG ACG AAG GCC GAG GAT GAG GAC GAC GAC 2064
Ala Ala Val Val Glu Ala Ala Thr Lys Ala Glu Asp Glu Asp Asp Asp
1245 1250 1255

GAG GAG GAG GCG GTG GCG GAG GAC AGG TAC GCG CTC GGC GGC GCG TGC 2112
Glu Glu Glu Ala Val Ala Glu Asp Arg Tyr Ala Leu Gly Gly Ala Cys
1260 1265 1270

AGG GTG CTC GCC GGA ATG CCC GCG CCG CTG GGC GCC ACC GCG CTC GCC 2160
Arg Val Leu Ala Gly Met Pro Ala Pro Leu Gly Ala Thr Ala Leu Ala
1275 1280 1285

GGC GGG GTC AAT TTC GCC GTC TAC TCC GGT GGA GCC ACC GCC GCG GCG 2208
Gly Gly Val Asn Phe Ala Val Tyr Ser Gly Gly Ala Thr Ala Ala Ala
1290 1295 1300

CTC TGC CTC TTC ACG CCA GAA GAT CTC AAG GCG GTG GGG TTG CCT CCC 2256
Leu Cys Leu Phe Thr Pro Glu Asp Leu Lys Ala Val Gly Leu Pro Pro
1305 1310 1315 1320

GAG TAG AGT TCA TCA GCT TTG CGT GCG CCG CGC GCC CCC TTT TCT GGC 2304
Glu * Ser Ser Ser Ala Leu Arg Ala Pro Arg Ala Pro Phe Ser Gly
1325 1330 1335

CTG CGA TTT AAG TTT TGT ACT GGG GGA AAT GCT GCA GGA TAG GGT GAC 2352
Leu Arg Phe Lys Phe Cys Thr Gly Gly Asn Ala Ala Gly * Gly Asp
1340 1345 1350

GGA GGA GGT TTC CCT TGA CCC CCT GAT GAA TCG GAC TGG GAA CGT GTG 2400
Gly Gly Gly Phe Pro * Pro Pro Asp Glu Ser Asp Trp Glu Arg Val
1355 1360 1365

GCA TGT CTT CAT TGA AGG CGA GCT GCA CGA CAT GCT TTA CGG GTA CAG 2448
Ala Cys Leu His * Arg Arg Ala Ala Arg His Ala Leu Arg Val Gln
1370 1375 1380

GTT CGA CGG CAC CTT TGC TCC TCA CTG CGG GCA CTA CCT TGA TAT TTC 2496
Val Arg Arg His Leu Cys Ser Ser Leu Arg Ala Leu Pro * Tyr Phe
1385 1390 1395 1400

CAA TGT CGT GGT GGA TCC TTA TGC TAA GGT GAT CAT ACT TTA GCT TTA 2544
Gln Cys Arg Gly Gly Ser Leu Cys * Gly Asp His Thr Leu Ala Leu
1405 1410 1415

CCT GCA TCT TGG TAT TTA CAG TAG AAA TTG TTA CGT GGA CCC TTA TTT 2592
Pro Ala Ser Trp Tyr Leu Gln * Lys Leu Leu Arg Gly Pro Leu Phe
1420 1425 1430

GTT GCC TTT TGT GTT GCT CTA GGC AGT GAT AAG CCG AGG GGA GTA TGG 2640
Val Ala Phe Cys Val Ala Leu Gly Ser Asp Lys Pro Arg Gly Val Trp
1435 1440 1445

CGT TCC GGC GCG TGG TAA CAA TTG CTG GCC TCA GAT GGC TGG CAT GAT 2688
Arg Ser Gly Ala Trp * Gln Leu Leu Ala Ser Asp Gly Trp His Asp
1450 1455 1460

CCC TCT TCC ATA TAG CAC GGT ATG CCT GAT TGC TGA AAA TAT TGG CTG 2736
Pro Ser Ser Ile * His Gly Met Pro Asp Cys * Lys Tyr Trp Leu
1465 1470 1475 1480

CAT TTG TTT CTC TCT TTT TCT CAT ATT TTT CTC CTG TCT TTC ACT TGT 2784
His Leu Phe Leu Ser Phe Ser His Ile Phe Leu Leu Ser Phe Thr Cys
1485 1490 1495

ACT ACA TTG CCT CAG ACA GTC ATG ATC AAA GAG AGC AGT GTC ATT AGA 2832
Thr Thr Leu Pro Gln Thr Val Met Ile Lys Glu Ser Ser Val Ile Arg
1500 1505 1510

CAT TTG TAG TTG TCT GCT GAC TTT GAC CAA AAC TTG TAA TTT ACT GTT 2880
His Leu * Leu Ser Ala Asp Phe Asp Gln Asn Leu * Phe Thr Val
1515 1520 1525

GTT AAA GGT CCT TGA ATC ATA TTT TTT TAT AAT ATT ATG TTT GCA AGT 2928
Val Lys Gly Pro * Ile Ile Phe Phe Tyr Asn Ile Met Phe Ala Ser
1530 1535 1540

GGA AGT AAA GTG AAA TTG CAT CTA GTA TTT GTT GTT GCT GTC TTA GTC 2976
Gly Ser Lys Val Lys Leu His Leu Val Phe Val Val Ala Val Leu Val
1545 1550 1555 1560

GTT TAA TTG GAC ATG CAG TAA AAA GGT TTG CAT CTG CAG TTT GAT TGG 3024
Val * Leu Asp Met Gln * Lys Gly Leu His Leu Gln Phe Asp Trp
1565 1570 1575

GAA GGC GAC CTA CCT CTA AGA TAT CCT CAA AAG GAC CTG GTA ATA TAT 3072
Glu Gly Asp Leu Pro Leu Arg Tyr Pro Gln Lys Asp Leu Val Ile Tyr
1580 1585 1590

GAG ATG CAC TTG CGT GGA TTC ACG AAG CAT GAT TCA AGC AAT GTA GAA 3120
Glu Met His Leu Arg Gly Phe Thr Lys His Asp Ser Ser Asn Val Glu
1595 1600 1605

CAT CCG GGT ACT TTC ATT GGA GCT GTG TCG AAG CTT GAC TAT TTG AAG 3168
His Pro Gly Thr Phe Ile Gly Ala Val Ser Lys Leu Asp Tyr Leu Lys
1610 1615 1620

GTA CAG CTG TAC TTG CTG ACT ACA TAG GAT AAT TTT TAA AGA AAG CTA 3216
Val Gln Leu Tyr Leu Leu Thr Thr * Asp Asn Phe * Arg Lys Leu
1625 1630 1635 1640

CAT ATT AGC CAG AAT TTG GGT TAT TAC AAA AAC TAC TGC ATA CTA TAG 3264
His Ile Ser Gln Asn Leu Gly Tyr Tyr Lys Asn Tyr Cys Ile Leu *
1645 1650 1655

CAG TTA CAT GCT CAT TAT CGA GGA GAT GCT CAC ACG CAT CTT ATT TGG 3312
Gln Leu His Ala His Tyr Arg Gly Asp Ala His Thr His Leu Ile Trp
1660 1665 1670

ATT TAA TAC CCA ATT CTG TTT TGA TAT TGG ACT GTT CCC TCT ACA GGA 3360
Ile * Tyr Pro Ile Leu Phe * Tyr Trp Thr Val Pro Ser Thr Gly
1675 1680 1685

GCT TGG AGT TAA TTG TAT TGA ATT AAT GCC CTG CCA TGA GTT CAA CGA 3408
Ala Trp Ser * Leu Tyr * Ile Asn Ala Leu Pro * Val Gln Arg
1690 1695 1700

GCT GGA GTA CTC AAC CTC TTC TTC CAA GTA AGG ACA TGA ATT TAG TAT 3456
Ala Gly Val Leu Asn Leu Phe Phe Gln Val Arg Thr * Ile * Tyr
1705 1710 1715 1720

TAG CCT GCC AGC ACT GTT TGA GTG AGA GTT CAT ACA CAT TTT GTG CCT 3504
* Pro Ala Ser Thr Val * Val Arg Val His Thr His Phe Val Pro
1725 1730 1735

GCA TAA CTG ATA TTT GTT CAA ACT ATT TTT TTT AGC AGT CAC TCA ACA 3552
Ala * Leu Ile Phe Val Gln Thr Ile Phe Phe Ser Ser His Ser Thr
1740 1745 1750

GTT TTA CAT ATA TAT ATA ATA TAG ACT ATT CGT CAC CCT GGG TGA GGA 3600
Val Leu His Ile Tyr Ile Ile * Thr Ile Arg His Pro Gly * Gly
1755 1760 1765

ATA GTT ATT CTT CAC CCA CCT CTA TTT TAA CAT CTA TGC ACC GTA ATT 3648
Ile Val Ile Leu His Pro Pro Leu Phe * His Leu Cys Thr Val Ile
1770 1775 1780

TTA CGT TTC GTA AAT TTG TCT TAT TTT AGA GAT AAA AAG AGA ACG TAA 3696
Leu Arg Phe Val Asn Leu Ser Tyr Phe Arg Asp Lys Lys Arg Thr *
1785 1790 1795 1800

GAA AAC CTA TAA TCG TCG TAA AAA AAA ATA TGT TAC GTA AAA TTA CAA 3744
Glu Asn Leu * Ser Ser * Lys Lys Ile Cys Tyr Val Lys Leu Gln
1805 1810 1815

ATG TAA AAA CAT AGT GTA AAA TGT ACA TAA AAT ACA TTT TTT GAC CTA 3792
Met * Lys His Ser Val Lys Cys Thr * Asn Thr Phe Phe Asp Leu
1820 1825 1830

TAT TTT TTT TGT TAA TGC CAA ATT TTA TAC AGT AAA TCA ATA TGA ATG 3840
Tyr Phe Phe Cys * Cys Gln Ile Leu Tyr Ser Lys Ser Ile * Met
1835 1840 1845

TAA CTA TTT GTA TTT CAA ATG TAA TTT ATT TAT GAA ATG GTC GTA AGA 3888
* Leu Phe Val Phe Gln Met * Phe Ile Tyr Glu Met Val Val Arg
1850 1855 1860

TTA CCT CGG GTG AAG AAT AAC TTA TTC TGC ACC CTG GGT GAT GAA TAG 3936
Leu Pro Arg Val Lys Asn Asn Leu Phe Cys Thr Leu Gly Asp Glu *
1865 1870 1875 1880

TAA CAC TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA CCG GCT 3984
* His Tyr Ile Tyr Ile Tyr Ile Tyr Ile Tyr Ile Tyr Ile Pro Ala
1885 1890 1895

GCT GCT AAT GAT GTT AAT ATT TCG CAA GTA CCT AAG CTG GAT TTT TCT 4032
Ala Ala Asn Asp Val Asn Ile Ser Gln Val Pro Lys Leu Asp Phe Ser
1900 1905 1910

CCA TGA GAC ATC AAT CCA TAA TTG AAA TTG GTC ACG ACA GTT GAA TAG 4080
Pro * Asp Ile Asn Pro * Leu Lys Leu Val Thr Thr Val Glu *
1915 1920 1925

TTG ATA GCT GAA AAT GAA ATC CAG CAT GCT ACT GTC TTG CCA TCT CCA 4128
Leu Ile Ala Glu Asn Glu Ile Gln His Ala Thr Val Leu Pro Ser Pro
1930 1935 1940

GAC TTG CTA ACA TGA ATT TTG TCT GCC TAC CTG TCA TTT GTA CCA ACG 4176
Asp Leu Leu Thr * Ile Leu Ser Ala Tyr Leu Ser Phe Val Pro Thr
1945 1950 1955 1960

TTC CCA ATT GCC CTC TCA TTA TTC GTG TGT ACC ATG CAT ATG TGT TTT 4224
Phe Pro Ile Ala Leu Ser Leu Phe Val Cys Thr Met His Met Cys Phe
1965 1970 1975

AAC ATG ATT ATT GTT GGC TAT ATT TCT CTT TGG AAA CAT GAC TAA TTT 4272
Asn Met Ile Ile Val Gly Tyr Ile Ser Leu Trp Lys His Asp * Phe
1980 1985 1990

ATC ACC CGT TTT GTA TAA ACT GCT TGT TTT CAT ATC AGG ATG AAC TTT 4320
Ile Thr Arg Phe Val * Thr Ala Cys Phe His Ile Arg Met Asn Phe
1995 2000 2005

TGG GGA TAT TCT ACC ATA AAC TTC TTT TCA CCA ATG ACG AGA TAC ACA 4368
Trp Gly Tyr Ser Thr Ile Asn Phe Phe Ser Pro Met Thr Arg Tyr Thr
2010 2015 2020

TCA GGC GGG ATA AAA AAC TGT GGG CGT GAT GCC ATA AAT GAG TTC AAA 4416
Ser Gly Gly Ile Lys Asn Cys Gly Arg Asp Ala Ile Asn Glu Phe Lys
2025 2030 2035 2040

ACT TTT GTA AGA GAG GCT CAC AAA CGG GGA ATT GAG GTA AGC AAG TCG 4464
Thr Phe Val Arg Glu Ala His Lys Arg Gly Ile Glu Val Ser Lys Ser
2045 2050 2055

TAC GAG TTA GTT GCT CCT TTT GAA CTT ATC AAT TTG ATG CGA AGA CAT 4512
Tyr Glu Leu Val Ala Pro Phe Glu Leu Ile Asn Leu Met Arg Arg His
2060 2065 2070

GTT ACT GCT AGG TGA TCC TGG ATG TTG TCT TCA ACC ATA CAG CTG AGG 4560
Val Thr Ala Arg * Ser Trp Met Leu Ser Ser Thr Ile Gln Leu Arg
2075 2080 2085

GTA ATG AGA ATG GTC CAA TAT TAT CAT TTA GGG GGG TCG ATA ATA CTA 4608
Val Met Arg Met Val Gln Tyr Tyr His Leu Gly Gly Ser Ile Ile Leu
2090 2095 2100

CAT ACT ATA TGC TTG CAC CCA AGG TGA CAG ATC TTT CTT GCT GCG TAA 4656
His Thr Ile Cys Leu His Pro Arg * Gln Ile Phe Leu Ala Ala *
2105 2110 2115 2120

TTG TTC TTT CAT AGA TGT ATA GAG CAT AGA TGT GTT ATG TAG TAG TTC 4704
Leu Phe Phe His Arg Cys Ile Glu His Arg Cys Val Met * * Phe
2125 2130 2135

TTT TTC AAG GGG ATT ATG TTC ATG CAG GGA GAG TTT TAT AAC TAT TCT 4752
Phe Phe Lys Gly Ile Met Phe Met Gln Gly Glu Phe Tyr Asn Tyr Ser
2140 2145 2150

GGC TGT GGG AAT ACC TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC 4800
Gly Cys Gly Asn Thr Phe Asn Cys Asn His Pro Val Val Arg Gln Phe
2155 2160 2165

ATT GTA GAT TGT TTA AGG TAC AGA TAT ACA TTT TAC TTC TAG AAC TAC 4848
Ile Val Asp Cys Leu Arg Tyr Arg Tyr Thr Phe Tyr Phe * Asn Tyr
2170 2175 2180

TTT TTC ATT TCT TTT GCT GCT TGT CAT TTT GAT ATG ATT AAT TTG CAA 4896
Phe Phe Ile Ser Phe Ala Ala Cys His Phe Asp Met Ile Asn Leu Gln
2185 2190 2195 2200

GCT TGT GGG GGT AAA TCT TTT GGT CAG CAT ATT GTA TCT TTA AAT GTC 4944
Ala Cys Gly Gly Lys Ser Phe Gly Gln His Ile Val Ser Leu Asn Val
2205 2210 2215

ACA AAT ACT AAT GTC CTG GTG CTT ATT GAT TTG GCA TCT TCA AAT TCT 4992
Thr Asn Thr Asn Val Leu Val Leu Ile Asp Leu Ala Ser Ser Asn Ser
2220 2225 2230

TCT CCA ATG AAA AGG GAA AAA TCT ACT GTA TGT CTC GTC AAC TAA TTT 5040
Ser Pro Met Lys Arg Glu Lys Ser Thr Val Cys Leu Val Asn * Phe
2235 2240 2245

ACT TTT GTT TTG CAG ATA CTG GGT GAT GGA AAT GCA TGT TGA TGG TTT 5088
Thr Phe Val Leu Gln Ile Leu Gly Asp Gly Asn Ala Cys * Trp Phe
2250 2255 2260

TCG TTT TGA TCT TGC ATC CAT AAT GAC CAG AGG TTC CAG GTA ATT TGT 5136
Ser Phe * Ser Cys Ile His Asn Asp Gln Arg Phe Gln Val Ile Cys
2265 2270 2275 2280

ATT TAT TGT TTG TTT GCG TGT TGC CTT TTC AGA AGA TTC TTA AAA GAA 5184
Ile Tyr Cys Leu Phe Ala Cys Cys Leu Phe Arg Arg Phe Leu Lys Glu
2285 2290 2295

TGT TTC TTT TAC AAG TCT GTG GGA TCC AGT TAA CGT GTA TGG AGC TCC 5232
Cys Phe Phe Tyr Lys Ser Val Gly Ser Ser * Arg Val Trp Ser Ser
2300 2305 2310

AAT AGA AGG TGA CAT GAT CAC AAC AGG GAC ACC TCT TGT TAC TCC ACC 5280
Asn Arg Arg * His Asp His Asn Arg Asp Thr Ser Cys Tyr Ser Thr
2315 2320 2325

ACT TAT TGA CAT GAT CAG CAA TGA CCC AAT TCT TGG AGG CGT CAA GGT 5328
Thr Tyr * His Asp Gln Gln * Pro Asn Ser Trp Arg Arg Gln Gly
2330 2335 2340

ACT TGT TTC ATC CAA CAC CTG TTG TCT GTG TGC ATT CAA TTG TTT TAA 5376
Thr Cys Phe Ile Gln His Leu Leu Ser Val Cys Ile Gln Leu Phe *
2345 2350 2355 2360

TAT GGT AAT GAT CAA TTT CCC AAT GTT GAT AAG GAA AAA AAA TGC AAG 5424
Tyr Gly Asn Asp Gln Phe Pro Asn Val Asp Lys Glu Lys Lys Cys Lys
2365 2370 2375

TAG CTC TCT TTA TCT GCT TCT TGT GAG TTA TGC TAA ACA TGT AGA TAC 5472
* Leu Ser Leu Ser Ala Ser Cys Glu Leu Cys * Thr Cys Arg Tyr
2380 2385 2390

TAC TAT ATT TCA ACT GTA TAT ACT TGA CAT ATT ATT GCT TCC TTG GGA 5520
Tyr Tyr Ile Ser Thr Val Tyr Thr * His Ile Ile Ala Ser Leu Gly
2395 2400 2405

GGC TCT CTT ATT CCT TTC CCC CGT TGC AAT TAT AGC TCA TTG CTG AAG 5568
Gly Ser Leu Ile Pro Phe Pro Arg Cys Asn Tyr Ser Ser Leu Leu Lys
2410 2415 2420

CAT GGG ATG CAG GAG GCC TCT ATC AAG TAG GTC AAT TCC CTC ACT GGA 5616
His Gly Met Gln Glu Ala Ser Ile Lys * Val Asn Ser Leu Thr Gly
2425 2430 2435 2440

ATG TTT GGT CTG AGT GGA ATG GGA AGG TAA GGT ACC TGT TAA AAG TTT 5664
Met Phe Gly Leu Ser Gly Met Gly Arg * Gly Thr Cys * Lys Phe
2445 2450 2455

GAA TGG CAA ATA CTG ATA GAA ATA TAA CTT ATA TTT GCG ACA TAT ATA 5712
Glu Trp Gln Ile Leu Ile Glu Ile * Leu Ile Phe Ala Thr Tyr Ile
2460 2465 2470

GAT AAA GCA AAA TAA TAC GCA TTC CAC CTG AAC TTT AAA GGG GCA CGC 5760
Asp Lys Ala Lys * Tyr Ala Phe His Leu Asn Phe Lys Gly Ala Arg
2475 2480 2485

AGA ATT ATC CCG CAT CTG TCT ACA AGA ATG ATA ACA CAT GTG CTG AAT 5808
Arg Ile Ile Pro His Leu Ser Thr Arg Met Ile Thr His Val Leu Asn
2490 2495 2500

AGT GAA GTA CTA CTT CTC AAA TGT CTG AAT GAA CGC ACT AAC TCT TGT 5856
Ser Glu Val Leu Leu Leu Lys Cys Leu Asn Glu Arg Thr Asn Ser Cys
2505 2510 2515 2520

GAG TGT CAA CCG AGC AAG AAA TAT TTG AGT TTT CTG CAA GAA ATT GTT 5904
Glu Cys Gln Pro Ser Lys Lys Tyr Leu Ser Phe Leu Gln Glu Ile Val
2525 2530 2535

CAT GTT GTG CTG TAT TAT ACT CCC TCC GTC CGA AAT TAT TTG TCG GAG 5952
His Val Val Leu Tyr Tyr Thr Pro Ser Val Arg Asn Tyr Leu Ser Glu
2540 2545 2550

AAA TGG ATG TAT CTA GAC GTA TTT TAG TTC TAG ATA CAT CCA TTT TTA 6000
Lys Trp Met Tyr Leu Asp Val Phe * Phe * Ile His Pro Phe Leu
2555 2560 2565

TCC ATT TCT GCA ACA AGT AGT TCC GGA CGG AGG GAG TAT CAT TTA ACA 6048
Ser Ile Ser Ala Thr Ser Ser Ser Gly Arg Arg Glu Tyr His Leu Thr
2570 2575 2580

AAT ATA TGC ATG TTC GAA GTA AAT CCC CAC GAA TAA GCA TAT AAG ACG 6096
Asn Ile Cys Met Phe Glu Val Asn Pro His Glu * Ala Tyr Lys Thr
2585 2590 2595 2600

ATA TTG CTT TTT GAC TTG CAA CAC CTA AAC CTC ATT GTT TTC TCC TAG 6144
Ile Leu Leu Phe Asp Leu Gln His Leu Asn Leu Ile Val Phe Ser *
2605 2610 2615

GAT TTT GGG TGT TCG AAG CAA GCA GCT GGT GAT ATT TAA TTT ACC TTT 6192
Asp Phe Gly Cys Ser Lys Gln Ala Ala Gly Asp Ile * Phe Thr Phe
2620 2625 2630

GCC TTT ATT TGT AGC TTG ATT TGA GGG TGC GGC AAA GGT TTT AGC TTA 6240
Ala Phe Ile Cys Ser Leu Ile * Gly Cys Gly Lys Gly Phe Ser Leu
2635 2640 2645

GTA GTG TTT TGT AAA TTA TTA TAG TTT ATG TAT ATA CTC CTC ATT TGG 6288
Val Val Phe Cys Lys Leu Leu * Phe Met Tyr Ile Leu Leu Ile Trp
2650 2655 2660

GCA CTT CCG TAC TGG TCC CAT AGA AGA TAA AAA TGG AAT GAT GTC TGG 6336
Ala Leu Pro Tyr Trp Ser His Arg Arg * Lys Trp Asn Asp Val Trp
2665 2670 2675 2680

CCA ATA ATT GTT GAC AAC ACT GTT GCG CAT TTG ATT TTT ATC AGG GAA 6384
Pro Ile Ile Val Asp Asn Thr Val Ala His Leu Ile Phe Ile Arg Glu
2685 2690 2695

TGG AAA ATT GAA ATC GGT AAG AAA CAT TGC GAT ATT AAG CTT GTA TAT 6432
Trp Lys Ile Glu Ile Gly Lys Lys His Cys Asp Ile Lys Leu Val Tyr
2700 2705 2710

GCT AAT GCT GGT GGA TCT TTA AGA GGG AAC ATA TGA TCT CGT GTG CAT 6480
Ala Asn Ala Gly Gly Ser Leu Arg Gly Asn Ile * Ser Arg Val His
2715 2720 2725

CCA TCT TCA ACT AAA AAA ATA TGT TGC ACA TCT CCC ACG TCA CTT ACT 6528
Pro Ser Ser Thr Lys Lys Ile Cys Cys Thr Ser Pro Thr Ser Leu Thr
2730 2735 2740

AGC TAT TTC ATC CAA GTA CTA ACT TGT GTG GTT GTC TCC TCA GTA CCG 6576
Ser Tyr Phe Ile Gln Val Leu Thr Cys Val Val Val Ser Ser Val Pro
2745 2750 2755 2760

GGA CAT TGT GCG CCA ATT CAT TAA AGG CAC TGA TGG ATT TGC TGG TGG 6624
Gly His Cys Ala Pro Ile His * Arg His * Trp Ile Cys Trp Trp
2765 2770 2775

TTT TGC CGA ATG TCT TTG TGG AAG TCC ACA CCT ATA CCA GGT AAG TTG 6672
Phe Cys Arg Met Ser Leu Trp Lys Ser Thr Pro Ile Pro Gly Lys Leu
2780 2785 2790

TGG CAA TAC TTG GAA ATG GGT TGA GTG AAT GTC ACA TGG ATT TTT TAT 6720
Trp Gln Tyr Leu Glu Met Gly * Val Asn Val Thr Trp Ile Phe Tyr
2795 2800 2805

ATA TAC CAC ATG ATG ATA CAC ATG TAA ATA TAT AAC GAT TAT AGT GTA 6768
Ile Tyr His Met Met Ile His Met * Ile Tyr Asn Asp Tyr Ser Val
2810 2815 2820

TGC ATA TGC ATT TGG CTA AGA AGT ACT CCC TCC CTT AGT AAA AGT TAG 6816
Cys Ile Cys Ile Trp Leu Arg Ser Thr Pro Ser Leu Ser Lys Ser *
2825 2830 2835 2840

TAC AAA GTT GAG TCA TCT ATT TTG GAA CGG AGG GAG TAT AAG TGT ATA 6864
Tyr Lys Val Glu Ser Ser Ile Leu Glu Arg Arg Glu Tyr Lys Cys Ile
2845 2850 2855

CAC TAG TGC AAT ATA TAG GTT TTA ACA CCC AAC TTG CCA ATG AAG GAA 6912
His * Cys Asn Ile * Val Leu Thr Pro Asn Leu Pro Met Lys Glu
2860 2865 2870

CAT AGG GCT TTC TAG TTA TCT TAT TTA TTT GTC TGG TGA ATA ATC CAC 6960
His Arg Ala Phe * Leu Ser Tyr Leu Phe Val Trp * Ile Ile His
2875 2880 2885

TGA AAA ATT CCA GCC ATG TCA TTT TTT AGG GGG GGA GAA GAA ACT ACA 7008
* Lys Ile Pro Ala Met Ser Phe Phe Arg Gly Gly Glu Glu Thr Thr
2890 2895 2900

TTG ATT TTT CCC CCT AAA AAA AGC CAT CTC AGA TTT CAT AGG TAA CTT 7056
Leu Ile Phe Pro Pro Lys Lys Ser His Leu Arg Phe His Arg * Leu
2905 2910 2915 2920

GCT TTT CTG TAA AGA AAT GAA AAC GAC TTC ATA CTT TCT GTC GAT TAT 7104
Ala Phe Leu * Arg Asn Glu Asn Asp Phe Ile Leu Ser Val Asp Tyr
2925 2930 2935

AAG TGT ATA CAC TAG TGC AAT ATA TAG GTT TTA ACA CCC AAC TTG CCA 7152
Lys Cys Ile His * Cys Asn Ile * Val Leu Thr Pro Asn Leu Pro
2940 2945 2950

ATG AAG GAA CAT AGG GCT TTC TAG TTA TCT TAT TTA TTT GCT GGT GAA 7200
Met Lys Glu His Arg Ala Phe * Leu Ser Tyr Leu Phe Ala Gly Glu
2955 2960 2965

TAA TCC ACT GAA AAA TTC CAG CCA TGT CAT TTT TTA GGG GGG AGA AGA 7248
* Ser Thr Glu Lys Phe Gln Pro Cys His Phe Leu Gly Gly Arg Arg
2970 2975 2980

AAC TAT ATT GAT TTT TCC CCC TAA AAA AAG CCA TCT CAG ATT CAT AGG 7296
Asn Tyr Ile Asp Phe Ser Pro * Lys Lys Pro Ser Gln Ile His Arg
2985 2990 2995 3000

AAC TTG CTT TTC TGT AAA GAA ATG AAA ACG ACT TCA TAC TTT CTG CGG 7344
Asn Leu Leu Phe Cys Lys Glu Met Lys Thr Thr Ser Tyr Phe Leu Arg
3005 3010 3015

CGC TTA CTT AGC TCG ATG GAT ATT TGT AAG ATG AAT GCC AAA TTA TTT 7392
Arg Leu Leu Ser Ser Met Asp Ile Cys Lys Met Asn Ala Lys Leu Phe
3020 3025 3030

GGC GGG ATT TGA TCG TTA TTC CAA ATT TCA TTT GGT TTC TCT AGC AAT 7440
Gly Gly Ile * Ser Leu Phe Gln Ile Ser Phe Gly Phe Ser Ser Asn
3035 3040 3045

CAA CCC AGT ACC TTG TTA TTG GCA CTG CAA TTT CTT ATT GAT TAA TCA 7488
Gln Pro Ser Thr Leu Leu Leu Ala Leu Gln Phe Leu Ile Asp * Ser
3050 3055 3060

GGC AGG AGG AAG GAA ACC TTG GCA CAG TAT CAA CTT GGT ATG TGC ACA 7536
Gly Arg Arg Lys Glu Thr Leu Ala Gln Tyr Gln Leu Gly Met Cys Thr
3065 3070 3075 3080

TGA TGG ATT TAC ACT GGG TGA TTT GGT ACA TAT AAT ACC AAG TCA ATT 7584
* Trp Ile Tyr Thr Gly * Phe Gly Thr Tyr Asn Thr Lys Ser Ile
3085 3090 3095

TAC CAA ATG GGG AGA CCA ATA GAG ATG GAG AAA ATC ACA ATC TTA GCT 7632
Tyr Gln Met Gly Arg Pro Ile Glu Met Glu Lys Ile Thr Ile Leu Ala
3100 3105 3110

GGA ATT GTG GGG AGG TAA TTC TGA ACT CTC CTT TTT TTT TGA AAT TTT 7680
Gly Ile Val Gly Arg * Phe * Thr Leu Leu Phe Phe * Asn Phe
3115 3120 3125

CAT GCT TTA CAT AAT AGT CAA ATG GCT GAC AAA TGT CGT TGT ATG GTT 7728
His Ala Leu His Asn Ser Gln Met Ala Asp Lys Cys Arg Cys Met Val
3130 3135 3140

CTC TCT ACC TAA ACC GTT AAG GCA GTA AGA GTT TCC CTA CAA GAT CTC 7776
Leu Ser Thr * Thr Val Lys Ala Val Arg Val Ser Leu Gln Asp Leu
3145 3150 3155 3160

TTT GTT CGT ATA ATT GTA TTT TCT AGA GAA AAG TTG CCT TCA ATT TTG 7824
Phe Val Arg Ile Ile Val Phe Ser Arg Glu Lys Leu Pro Ser Ile Leu
3165 3170 3175

TGC ACG CGG CAG TAC AGG AAT TGT GGT TAT AAA TAT TGA TAC AGG CTG 7872
Cys Thr Arg Gln Tyr Arg Asn Cys Gly Tyr Lys Tyr * Tyr Arg Leu
3180 3185 3190

ACC ATC GTT ACT AAT AGG GGG AAC AAT AAG CAC ATT TTT TTA ATA GCA 7920
Thr Ile Val Thr Asn Arg Gly Asn Asn Lys His Ile Phe Leu Ile Ala
3195 3200 3205

AAG GCA TCA CCC TTG TTC CGT TTC CAA TGA AAT CAC AGT ATC CGA ACC 7968
Lys Ala Ser Pro Leu Phe Arg Phe Gln * Asn His Ser Ile Arg Thr
3210 3215 3220

ATA AGT TTT ACA AGT ATG CGT AGA GAG AAA TAA AGT ATC AAC CCG GCA 8016
Ile Ser Phe Thr Ser Met Arg Arg Glu Lys * Ser Ile Asn Pro Ala
3225 3230 3235 3240

GAA ACA GTT GTT TCA GGC GCA AAG AGA AAA GGA AAC GAT ATG CTC TAT 8064
Glu Thr Val Val Ser Gly Ala Lys Arg Lys Gly Asn Asp Met Leu Tyr
3245 3250 3255

TAC ATC AAC CTT TTA GCA TTT AGG GAC GAC CAG CAT CAT CCC ATC TTC 8112
Tyr Ile Asn Leu Leu Ala Phe Arg Asp Asp Gln His His Pro Ile Phe
3260 3265 3270

AAT CAA CTG GAG CGA GGT CAC CTC CAA TCT TCT CAG CAG CCT CAG AGT 8160
Asn Gln Leu Glu Arg Gly His Leu Gln Ser Ser Gln Gln Pro Gln Ser
3275 3280 3285

GGT GAC CTC CCA AGC AAG TGC ATC AGC ATC CAT CAT CTG GGG GTT GGG 8208
Gly Asp Leu Pro Ser Lys Cys Ile Ser Ile His His Leu Gly Val Gly
3290 3295 3300

CAC ATA CCA TGA GCA CAA TCA CCT GAA TTT GAT GAA TTT TCC TCT GTT 8256
His Ile Pro * Ala Gln Ser Pro Glu Phe Asp Glu Phe Ser Ser Val
3305 3310 3315 3320

TAC CTT GCA GCA GAC CCC TGC CGT ATA AAT GGT TTT AAA TGA CAG CAT 8304
Tyr Leu Ala Ala Asp Pro Cys Arg Ile Asn Gly Phe Lys * Gln His
3325 3330 3335

GTT CTT TCA GTT TGA GCA AAA TTT GTG CAA TTG CAA AGA AGC TTT AGA 8352
Val Leu Ser Val * Ala Lys Phe Val Gln Leu Gln Arg Ser Phe Arg
3340 3345 3350

ATC ATG TGG AAC ATG CAC TTA CAT TTC ATC TGA CAA TAT AGG AAG GAG 8400
Ile Met Trp Asn Met His Leu His Phe Ile * Gln Tyr Arg Lys Glu
3355 3360 3365

AGC CCG ACG TCG CAT GCT CCT CTA GAC TCG AGG AAT TCG CAA GAT TGT 8448
Ser Pro Thr Ser His Ala Pro Leu Asp Ser Arg Asn Ser Gln Asp Cys
3370 3375 3380

CTG TCA AAA GAT TGA GGA AGA GGC AGA TGC GCA ATT TCT TTG TTT GTC 8496
Leu Ser Lys Asp * Gly Arg Gly Arg Cys Ala Ile Ser Leu Phe Val
3385 3390 3395 3400

TCA TGG TTT CTC AAG TAA GAC TTA TAT CTG ATC TCT TCA ATT TTT GAG 8544
Ser Trp Phe Leu Lys * Asp Leu Tyr Leu Ile Ser Ser Ile Phe Glu
3405 3410 3415

ATT GCC TGT TTT TCA CAA TGG CAT ATG TTG TCA GGT GAA ACA TCC AAT 8592
Ile Ala Cys Phe Ser Gln Trp His Met Leu Ser Gly Glu Thr Ser Asn
3420 3425 3430

CCC AGT ATT AAT AGA GCC AAC ATG AAG GGA TTG CTT ATC TGA GAT ATC 8640
Pro Ser Ile Asn Arg Ala Asn Met Lys Gly Leu Leu Ile * Asp Ile
3435 3440 3445

TGC CAA AGT TGA ATT CTT AGA TTC ACC TTC TTC AGT ATT TCA GAC CTT 8688
Cys Gln Ser * Ile Leu Arg Phe Thr Phe Phe Ser Ile Ser Asp Leu
3450 3455 3460

CTA AGC ATT TTC ATT TTT TTT TTC AAT TGT TAG GGA GTT CCA ATG TTT 8736
Leu Ser Ile Phe Ile Phe Phe Phe Asn Cys * Gly Val Pro Met Phe
3465 3470 3475 3480

TAC ATG GGC GAT GAA TAT GGC CAC ACA AAA GGG GGC AAC AAC AAT ACA 8784
Tyr Met Gly Asp Glu Tyr Gly His Thr Lys Gly Gly Asn Asn Asn Thr
3485 3490 3495

TAC TGC CAT GAT TCT TAT GTC AGT ACA ATT TGG TCA CAT ATT GTT GTT 8832
Tyr Cys His Asp Ser Tyr Val Ser Thr Ile Trp Ser His Ile Val Val
3500 3505 3510

CTA AGT AAC TAT CTT CAA ATC TTT GCA TTC ATC CGT CAT GGC TCT TCT 8880
Leu Ser Asn Tyr Leu Gln Ile Phe Ala Phe Ile Arg His Gly Ser Ser
3515 3520 3525

GTA GGT CAA TTA TTT TCG CTG GGA TAA AAA AGA ACA ATA CTC TGA CTT 8928
Val Gly Gln Leu Phe Ser Leu Gly * Lys Arg Thr Ile Leu * Leu
3530 3535 3540

GCA AAG ATT CTG CTG CCT CAT GAC CAA ATT CCG CAA GTA AGT ATT CCG 8976
Ala Lys Ile Leu Leu Pro His Asp Gln Ile Pro Gln Val Ser Ile Pro
3545 3550 3555 3560

TTG AAT AAT TTC TGT GTA GAA CCA CTG AAG GTG CCT CCA AAC GCT AAG 9024
Leu Asn Asn Phe Cys Val Glu Pro Leu Lys Val Pro Pro Asn Ala Lys
3565 3570 3575

CGA GCA AGG TCA ATT TCA CAC CCT AAT CAA GTT GGT GTT GTC TAT TTG 9072
Arg Ala Arg Ser Ile Ser His Pro Asn Gln Val Gly Val Val Tyr Leu
3580 3585 3590

TGT ATT TGA TCT GCT GCA CTG TAG GGA GTG CGA GGG TCT TGG CCT TGA 9120
Cys Ile * Ser Ala Ala Leu * Gly Val Arg Gly Ser Trp Pro *
3595 3600 3605

GGA CTT TCC AAC GGC CGA ACG GCT GCA GTG GCA TGG TCA TCA GCC TGG 9168
Gly Leu Ser Asn Gly Arg Thr Ala Ala Val Ala Trp Ser Ser Ala Trp
3610 3615 3620

GAA GCC TGA TTG GTC TGA GAA TAG CCG ATT CGT TGC CTT TTC CAT GGT 9216
Glu Ala * Leu Val * Glu * Pro Ile Arg Cys Leu Phe His Gly
3625 3630 3635 3640

ACA CAT ATA GTT CTG ACA CTT CAC TAT AGT TGT TTT AAA AAA GAA AAT 9264
Thr His Ile Val Leu Thr Leu His Tyr Ser Cys Phe Lys Lys Glu Asn
3645 3650 3655

TTA ACT CAA AAG TAA ATT ATG GAG A 9289
Leu Thr Gln Lys * Ile Met Glu
3660
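
The listing above pairs each codon with a three-letter residue name and marks stop codons with an asterisk. As an illustrative aid only, and not as part of the specification, the following Python sketch reproduces that pairing for one row. It assumes the standard genetic code; the helper name `translate_row` is hypothetical.

```python
# Illustrative sketch: translate one row of space-separated codons into the
# three-letter residue names used in the listing ("*" marks a stop codon).
# Assumes the standard genetic code; translate_row is a hypothetical helper.
from itertools import product

BASES = "TCAG"
# One-letter residues for the 64 codons in TCAG order (standard code).
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}
# Map each codon string (e.g. "ATG") to its three-letter residue name.
CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate_row(row: str) -> str:
    """Translate the complete codons in a row; a trailing partial codon is skipped."""
    codons = [c for c in row.upper().split() if len(c) == 3]
    return " ".join(CODON_TABLE[c] for c in codons)


if __name__ == "__main__":
    # Final row of the listing; the lone trailing "A" is not a full codon.
    print(translate_row("TTA ACT CAA AAG TAA ATT ATG GAG A"))
```

Comparing the output of such a helper with the residue line printed under each codon row is one way to spot transcription errors in either line.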

CLAIMS

1. A nucleic acid sequence encoding an enzyme of the
starch biosynthetic pathway in a cereal plant, wherein the
enzyme is selected from the group consisting of starch
branching enzyme I, starch branching enzyme II, soluble
starch synthase I, and debranching enzyme, with the proviso
that the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize.

2. A sequence according to claim 1, wherein the 
sequence is a genomic DNA or cDNA sequence. 

3. A sequence according to claim 1 or claim 2, 
wherein the sequence is functional in wheat.

4. A sequence according to any one of claims 1 to 3,
wherein the sequence is derived from a Triticum species. 

5. A sequence according to claim 4, wherein the
Triticum species is Triticum tauschii.

6. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme I or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 5 or SEQ ID NO: 9.

7. A sequence according to claim 6, wherein the 
homology is at least 90%.
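
Claims 6 and 7 set thresholds of 70% and 90% sequence homology but do not themselves state how the percentage is computed. Purely as an illustration, and assuming that homology is read as percent identity over a pair of sequences that have already been aligned (an assumption, not a definition taken from the claims), such a check might be sketched in Python as follows. The names `percent_identity` and `meets_threshold` are hypothetical.

```python
# Illustrative sketch only: percent identity between two pre-aligned sequences.
# ASSUMPTION: "homology" is treated here as percent identity; the claims do not
# define the calculation. Gaps are written as "-".


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of aligned columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # a column that is a gap in both sequences carries no information
        columns += 1
        if x == y:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when the percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    a = "ATGATGGCCA"
    b = "ATGATGGCCT"
    print(percent_identity(a, b), meets_threshold(a, b, 90.0))
```

The result depends on how the alignment was produced and on how gap columns are counted, so any real comparison would need those choices stated explicitly.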

8. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme II or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 10.

9. A sequence according to claim 8, wherein the
homology is at least 90%.

10. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes soluble starch synthase or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 11 or SEQ ID NO: 13.

11. A sequence according to claim 10, wherein the
homology is at least 90%.

12. A sequence according to claim 11, wherein the 
sequence encodes a 75 kD soluble starch synthase of wheat. 

13. A sequence according to claim 12, which encodes an
amino acid sequence at least 70% homologous to that shown in
SEQ ID NO: 14.

14. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes debranching enzyme or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 17.

15. A sequence according to claim 14, wherein the 
homology is at least 90%. 

16. A promoter of an enzyme selected from the group
consisting of starch branching enzyme I, starch branching
enzyme II, soluble starch synthase I, and debranching
enzyme, with the proviso that the enzyme is not soluble
starch synthase I of rice, or starch branching enzyme I of
rice or maize.

17. A promoter according to claim 16, wherein the
promoter is a starch branching enzyme I promoter or a
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID NO: 8.

18. A sequence according to claim 17, wherein the
homology is at least 90%.

19. A promoter according to claim 16, wherein the
promoter is a soluble starch synthase I promoter or a
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID NO: 15.

20. A sequence according to claim 19, wherein the 
homology is at least 90%.

21. A nucleic acid construct comprising a nucleic acid
sequence encoding an enzyme of the starch biosynthetic
pathway in a cereal plant, operably linked to one or more
nucleic acid sequences facilitating expression of the
nucleic acid sequence in a plant, wherein the enzyme is
selected from the group consisting of starch branching
enzyme I, starch branching enzyme II, soluble starch
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize, or a
biologically-active fragment thereof.

22. A nucleic acid construct for targeting a gene to
the endosperm of a cereal plant, comprising one or more
promoter sequences selected from the group consisting of
SBE I promoter, SBE II promoter, SSS I promoter, and
DBE promoter, operatively linked to a nucleic acid sequence
encoding a protein, wherein the expression of the targeted
gene in the endosperm of a cereal plant is modified.

23. A construct according to either claim 21 or claim
22, wherein the promoter or nucleic acid sequence is also
operatively linked to one or more additional targeting
sequences and/or one or more 3' untranslated sequences.

24. A construct according to claim 23, wherein the
nucleic acid encoding the protein is either in the sense or
antisense orientation.
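
For a DNA insert, the antisense orientation referred to in claim 24 corresponds to reading the reverse complement of the sense strand. A minimal Python sketch of that operation follows, using the hypothetical helper name `reverse_complement` and assuming an unambiguous A/C/G/T sequence; it illustrates the concept only and is not a procedure taken from the specification.

```python
# Illustrative sketch: reverse complement of a DNA sequence, i.e. the strand
# that would be read when an insert is placed in the antisense orientation.
# ASSUMPTION: only unambiguous bases A, C, G and T (either case) are present.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    unexpected = set(seq) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse to restore 5'-to-3' reading order.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGGCC"))
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing sense and antisense versions of the same insert.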

25. A construct according to claim 24, wherein the
protein is an enzyme of the starch biosynthetic pathway.

26. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the antisense
orientation, and the enzyme is selected from the group
consisting of GBSS, starch debranching enzyme, SBE II, low
molecular weight glutenin, and grain softness protein I.

27. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the sense
orientation, and the enzyme is selected from the group
consisting of bacterial isoamylase, bacterial glycogen
synthase, and wheat high molecular weight glutenin Bx17.

28. A construct according to any one of claims 21 to
27, wherein the plant is a cereal plant.

29. A construct according to claim 28, wherein the 
cereal plant is either wheat or barley. 

30. A construct according to claim 29, wherein the
cereal plant is wheat.

31. A construct according to any one of claims 21 to
30, wherein the construct is either a plasmid or a vector.

32. A construct according to claim 31, wherein the
plasmid or vector is suitable for use in the transformation
of a plant.



33. A construct according to claim 32, wherein the
plasmid is selected from the group consisting of those
depicted in Figures 22a to 22f.

34. A construct according to claim 32, wherein the 
vector is a bacterium of the genus Agrobacterium.

35. A construct according to claim 34, wherein the 
vector is Agrobacterium tumefaciens. 



36. A method of modifying the characteristics of
starch produced by a plant, comprising the steps of:
(a) introducing a nucleic acid sequence encoding
an enzyme of the starch biosynthetic pathway into a host
plant, and/or
(b) introducing an anti-sense nucleic acid
sequence directed to a gene encoding an enzyme of the starch
biosynthetic pathway into a host plant,
wherein the enzyme is selected from the group
consisting of starch branching enzyme I, starch branching
enzyme II, soluble starch synthase I, and debranching
enzyme, with the proviso that the enzyme is not soluble
starch synthase I of rice, or starch branching enzyme I of
rice or maize, and wherein if both steps (a) and (b) are
used, the enzymes in the two steps are different.

37. A method according to claim 36, wherein the plant
is a cereal plant.



38. A method according to claim 37, wherein the cereal
plant is wheat or barley.

39. A method of targeting expression of a gene to the
endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to any one
of claims 21 to 35.

40. A method of modulating the time of expression of a 
gene in endosperm of a cereal plant, comprising the step of 
transforming the plant with a construct according to any one 
of claims 21 to 35. 

41. A method according to claim 40, wherein when
expression at an early stage following anthesis is desired,
the construct comprises either the SBE II, SSS I, or DBE
promoter.

42. A method according to claim 40, wherein when 
expression at a later stage following anthesis is desired, 
the construct comprises the SBE I promoter. 

43. A plant transformed with a construct according to 
any one of claims 21 to 35. 

44. A plant according to claim 43, wherein the plant 
is a cereal plant. 

45. A plant according to claim 44, wherein the cereal 
plant is wheat or barley. 

46. A method of identifying variations in the starch 
synthesis characteristics of a cereal plant, comprising the 
step of identifying a variation in nucleic acid sequence in 
the intron regions of the SBE I, SBE II, SSS I or DBE genes. 

47. A method of identifying variations in the starch 
synthesis characteristics of a cereal plant, comprising the 
step of identifying a variation in nucleic acid sequence 
compared to the sequence shown in one or more of SEQ ID NO: 5,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID
NO: 13, SEQ ID NO: 15, SEQ ID NO: 16, or SEQ ID NO: 17.

48. A method according to claim 47, in which a
mutation or absence of a SBE I, SBE II, SSS I or DBE gene is
detected.

49. A method according to either claim 47 or claim 48, 
in which the cereal plant is wheat or barley. 

50. A product comprising plant material propagated
from a plant transformed with a nucleic acid sequence
encoding an enzyme of the starch biosynthetic pathway in a
cereal plant, operably linked to one or more nucleic acid
sequences facilitating expression of the nucleic acid
sequence in a plant, wherein the enzyme is selected from the
group consisting of starch branching enzyme I, starch
branching enzyme II, soluble starch synthase I, and
debranching enzyme, with the proviso that the enzyme is not
soluble starch synthase I of rice, or starch branching
enzyme I of rice or maize, a biologically-active fragment
thereof.

51. A product comprising plant material propagated
from a plant in which a gene was targeted to the endosperm
of a cereal plant, by a nucleic acid construct comprising
one or more promoter sequences selected from the group
consisting of SBE I promoter, SBE II promoter, SSS I
promoter, and DBE promoter, operatively linked to a nucleic
acid sequence encoding a protein, wherein the expression of
the targeted gene in the endosperm of a cereal plant is
modified.

52. A product according to claim 50 or claim 51 
wherein the product is a food product. 

E01'/E02    WBE2E1F        CGT CGC TGC TCC TCA GGA AG
2    E01/E02    sr854.1180F    CTG GCT GAC TCA ATC ACT ACG
E02/E03    WBE2E2F        CGC AAC CTG AAG AAT TAC AG
ATT TTC GGA GCC ATC TTG AC
7    E05/I05    sr913F    ATC ACT TAC CGA GAA TGG G

SBE II Intron 5 primer set - digested with DdeI